Abstract


Consider the stationary process $y_t = \beta\cos(\omega_0 t + \phi) + \epsilon_t$ and a parametric filter $\mathcal{L}_\alpha$, and let $\rho(\alpha)$ be the first-order autocorrelation of the filtered process $\{\mathcal{L}_\alpha(y)_t\}$. Under a certain assumption on the filtered noise spectrum, $\rho(\alpha)$ is contractive at $\cos\omega_0$. It is shown that the sample estimate of $\rho(\alpha)$, denoted by $\hat\rho_n(\alpha)$ and obtained from a finite sample of length $n$, has with probability one a fixed point $\hat\alpha_n$ in a neighborhood of $\cos\omega_0$, and that the sequence of fixed points $\{\hat\alpha_n\}$ converges with probability one to $\cos\omega_0$. The proof is based on a general result regarding the uniform consistency of the sample autocorrelation. The developed theory is illustrated by two numerical examples pertaining to two different parametric time-invariant filters.


Abbreviated  Title:  “CM  Method  for  Frequency  Estimation” 

Key words and phrases: Frequency estimation, iterative filtering, consistency, fixed-point iteration, secant method, parametric filter, spectrum analysis.


1  Introduction 


The classical problem of frequency estimation is of interest in a wide range of engineering and scientific applications. The problem is well formulated in the signal processing and statistics literature, and has been studied by many researchers. Recently a general iterative filtering approach, called the contraction mapping (CM) method, was suggested by He and Kedem [6], Yakowitz [18], and Kedem [10] for estimating the frequency of a single sinusoid in additive noise from a finite sample $\{y_0, y_1, \ldots, y_{n-1}\}$ obtained from a process of the form

$$y_t = \beta\cos(\omega_0 t + \phi) + \epsilon_t. \tag{1.1}$$

Here $\beta > 0$ and $\omega_0 \in (0, \pi)$ are constants, $\phi$ is uniformly distributed on $[0, 2\pi)$, i.e., $\phi \sim U[0, 2\pi)$, and $\{\epsilon_t\}$ is a zero-mean stationary process, independent of $\phi$, with spectral distribution function $F(\omega)$ which is continuous at $\omega_0$. The gist of the CM method is as follows. Using a parametric filter $\mathcal{L}_\alpha$ that satisfies the so-called fundamental property

$$\mathrm{Corr}\big(\mathcal{L}_\alpha(\epsilon)_{t+1},\, \mathcal{L}_\alpha(\epsilon)_t\big) = \alpha, \tag{1.2}$$

a sequence of estimators is constructed by an iterative procedure of the form

$$\alpha_j = \hat\rho_n(\alpha_{j-1}) \tag{1.3}$$

where $\hat\rho_n(\alpha)$ is an estimator of the first-order autocorrelation of $\{\mathcal{L}_\alpha(y)_t\}$. This procedure determines a fixed-point of the mapping $\hat\rho_n(\alpha)$. As will be shown, the sequence of the fixed-points converges to $\cos\omega_0$ as $n \to \infty$.

In this paper we provide an asymptotic analysis of the CM method, focusing on strong (almost sure) consistency. We shall discuss, for a given finite sample, the existence of a fixed-point of $\hat\rho_n(\alpha)$, which will be referred to as the CM estimator, the convergence of some iterative procedures for finding the fixed-point, and finally, the consistency of the fixed-point as the sample size tends to infinity. We shall show, under appropriate conditions, that the existence, convergence, and consistency of the CM estimator can be established almost surely for sufficiently large sample size, provided the parametric filter satisfies the fundamental property (1.2), in addition to some fairly mild conditions.

The fundamental property (1.2) required by the CM method is exhibited by many parametric filters, while many more can be reparametrized so as to satisfy the property (see, for example, [12]). An important special case is the AR(2) filter discussed by Quinn and Fernandes [16] for a special case, and considered in more generality in this paper (see Section 7.2). For their special AR(2) filter, one can check that the fundamental property holds in a limiting sense if the noise has a sufficiently smooth spectral density. Using a variant of the iterative procedure (1.3), Quinn and Fernandes showed that with their AR(2) filter a sequence of estimators can be produced that converges to the unknown frequency almost surely, and that the precision of the estimator is the same as that achieved by the nonlinear least squares estimator (or the maximum likelihood estimator if $\{\epsilon_t\}$ is Gaussian white noise). The main contribution of the present paper is the proof that the CM method is strongly consistent for a much wider class of parametric filters in addition to the special case determined by the AR(2) filter. This, coupled with the Quinn-Fernandes result, shows that the CM method can produce asymptotically efficient estimates.

One  of  the  advantages  of  the  CM  method  is  its  computational  simplicity.  It  bypasses 
time-consuming  nonlinear  optimization  routines  needed  for  the  nonlinear  least  squares  method. 
Some  related  iterative  filtering  methods  for  frequency  estimation  have  been  suggested  in  the 
literature,  of  which  we  mention  the  work  of  Kay  [9],  Kumaresan,  Scharf,  and  Shaw  [13],  and 
Dragosevic  and  S.  S.  Stankovic  [4]. 

2  The  CM  Method 

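Let the random sequence $\{y_t\}$ be given by (1.1). Consider a parametric linear time-invariant causal filter $\mathcal{L}_\alpha$, indexed by $\alpha \in [\underline{\alpha}, \overline{\alpha}]$, with real-valued impulse response sequence $\{h_j(\alpha)\}_{j=0}^{\infty}$, where $\underline{\alpha}$ and $\overline{\alpha}$ are constants such that $-1 < \underline{\alpha} < \cos\omega_0 < \overline{\alpha} < 1$. Let $H(\omega; \alpha)$ be the transfer function of $\mathcal{L}_\alpha$ defined by

$$H(\omega;\alpha) := \sum_{j=0}^{\infty} h_j(\alpha)\, e^{-ij\omega}.$$

It is easy to see that $\overline{H(\omega;\alpha)} = H(-\omega;\alpha)$, where the overline denotes the complex conjugate operation.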

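Applying the filter $\mathcal{L}_\alpha$ to $\{y_t\}$ and $\{\epsilon_t\}$ yields the filtered data $\{y_t(\alpha)\}$ and the filtered noise $\{\epsilon_t(\alpha)\}$ defined, respectively, by

$$y_t(\alpha) := \sum_{j=0}^{\infty} h_j(\alpha)\, y_{t-j} \quad\text{and}\quad \epsilon_t(\alpha) := \sum_{j=0}^{\infty} h_j(\alpha)\, \epsilon_{t-j}. \tag{2.1}$$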


Let $\rho(\alpha)$ be the first-order autocorrelation of $\{y_t(\alpha)\}$. Then the spectral representation of the autocorrelation function gives

$$\rho(\alpha) = \frac{\sigma^2 |H(\omega_0;\alpha)|^2 \cos\omega_0 + \int_{-\pi}^{\pi} |H(\omega;\alpha)|^2 \cos\omega \, dF(\omega)}{\sigma^2 |H(\omega_0;\alpha)|^2 + \int_{-\pi}^{\pi} |H(\omega;\alpha)|^2 \, dF(\omega)} \tag{2.2}$$

where $\sigma^2 := \beta^2/2$ is the variance of the signal. For convenience, it is always assumed that $\int_{-\pi}^{\pi} |H(\omega;\alpha)|^2 \, dF(\omega) = 0$ implies $|H(\omega_0;\alpha)| = 0$, which means that the noise cannot be completely removed without filtering out the signal.

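In the sequel, we always assume that for any $\alpha \in [\underline{\alpha}, \overline{\alpha}]$, the filter $\mathcal{L}_\alpha$ satisfies the so-called "fundamental property"

$$\alpha = \rho_\epsilon(\alpha) := \frac{E\{\epsilon_{t+1}(\alpha)\,\epsilon_t(\alpha)\}}{E\{\epsilon_t^2(\alpha)\}}, \tag{2.3}$$

that is,

$$\alpha = \frac{\int_{-\pi}^{\pi} |H(\omega;\alpha)|^2 \cos\omega \, dF(\omega)}{\int_{-\pi}^{\pi} |H(\omega;\alpha)|^2 \, dF(\omega)},$$

where $\rho_\epsilon(\alpha)$ stands for the first-order autocorrelation of $\{\epsilon_t(\alpha)\}$. Under this assumption, (2.2) reduces to

$$\rho(\alpha) = \alpha^* + C(\alpha)\,(\alpha - \alpha^*) \tag{2.4}$$

where $\alpha^* := \cos\omega_0$, and

$$C(\alpha) := \frac{1}{1 + \gamma(\alpha)}$$

with $\gamma(\alpha)$ being the signal-to-noise ratio of the filtered data $\{y_t(\alpha)\}$ defined by

$$\gamma(\alpha) := \frac{\sigma^2 |H(\omega_0;\alpha)|^2}{\int_{-\pi}^{\pi} |H(\omega;\alpha)|^2 \, dF(\omega)}.$$

Clearly, for all $\alpha \in [\underline{\alpha}, \overline{\alpha}]$, $0 < C(\alpha) \le 1$ with $C(\alpha) = 1$ if and only if $|H(\omega_0;\alpha)| = 0$, or if and only if the filter $\mathcal{L}_\alpha$ does not capture the frequency.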
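As a numerical illustration of (2.2)-(2.4), the sketch below checks the fundamental property and the reduced form of $\rho(\alpha)$ for one concrete case. The choice of filter and noise is an assumption made here for illustration only: a first-order autoregressive filter with $h_j(\alpha) = \alpha^j$ applied to white noise, for which the lag-one autocorrelation of the filtered noise is known to equal $\alpha$. The function names and the grid size are hypothetical.

```python
import math

def squared_gain(omega, alpha):
    """|H(omega; alpha)|^2 for the assumed filter h_j(alpha) = alpha**j."""
    return 1.0 / (1.0 - 2.0 * alpha * math.cos(omega) + alpha * alpha)

def noise_integrals(alpha, noise_var=1.0, grid=20000):
    """Riemann sums for int |H|^2 cos(w) dF and int |H|^2 dF over [-pi, pi),
    assuming white noise, i.e. dF(w) = noise_var / (2 pi) dw."""
    step = 2.0 * math.pi / grid
    num = den = 0.0
    for k in range(grid):
        w = -math.pi + k * step
        mass = squared_gain(w, alpha) * noise_var / (2.0 * math.pi) * step
        num += mass * math.cos(w)
        den += mass
    return num, den

def rho(alpha, omega0, sigma2, noise_var=1.0):
    """First-order autocorrelation of the filtered data, as in (2.2)."""
    num, den = noise_integrals(alpha, noise_var)
    signal = sigma2 * squared_gain(omega0, alpha)
    return (signal * math.cos(omega0) + num) / (signal + den)

def contraction_coefficient(alpha, omega0, sigma2, noise_var=1.0):
    """C(alpha) = 1 / (1 + gamma(alpha)), gamma being the filtered SNR."""
    _, den = noise_integrals(alpha, noise_var)
    gamma = sigma2 * squared_gain(omega0, alpha) / den
    return 1.0 / (1.0 + gamma)

omega0, sigma2, alpha = 1.0, 0.5, 0.3          # illustrative values
num, den = noise_integrals(alpha)
rho_eps = num / den                            # fundamental property: close to alpha
alpha_star = math.cos(omega0)
lhs = rho(alpha, omega0, sigma2)               # direct evaluation of (2.2)
rhs = alpha_star + contraction_coefficient(alpha, omega0, sigma2) * (alpha - alpha_star)
print(rho_eps, lhs, rhs)
```

Under these assumptions the two evaluations of $\rho(\alpha)$ agree up to quadrature error, which is what (2.4) asserts once (2.3) holds.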
The original idea of utilizing (2.4) to estimate $\alpha^*$ was proposed by He and Kedem [6] for a specific filter, known as the $\alpha$-filter, and extended by Yakowitz [18] to any parametric filter satisfying (2.3). To obtain an estimator of $\alpha^*$, they employed the iterative procedure

$$\hat\alpha_n^{(m)} := \hat\rho_n\big(\hat\alpha_n^{(m-1)}\big), \qquad m = 1, 2, \ldots \tag{2.5}$$

where $\hat\rho_n(\alpha)$ is an estimator of $\rho(\alpha)$ based on a finite sample of size $n$. In the numerical analysis literature, this procedure is known as the fixed-point iteration (FPI) (see, for example, [17]), which is used to find a fixed-point of $\hat\rho_n(\alpha)$, if one exists. The original motivation for using this iterative procedure to estimate $\alpha^*$ was based on the heuristic argument that in the limiting case as the sample size $n$ tends to infinity, the limit of $\hat\alpha_n^{(m)}$, denoted by $\alpha^{(m)}$, satisfies the equation

$$\alpha^{(m)} - \alpha^* = C\big(\alpha^{(m-1)}\big)\,\big(\alpha^{(m-1)} - \alpha^*\big).$$

As can be seen, the error in $\alpha^{(m)}$ is contracted by a factor of $C(\alpha^{(m-1)})$ as compared with that in $\alpha^{(m-1)}$, hence the name contraction mapping (CM) method. It is readily seen that as $m$ tends to infinity $\alpha^{(m)}$ converges to $\alpha^*$ monotonically, provided $C(\alpha)$ is uniformly bounded above by some constant which is strictly less than one. This argument, however, does not lead to the conclusion that for a fixed sample size $n$, the sequence $\{\hat\alpha_n^{(m)}\}$ would converge as $m$ (not $n$) tends to infinity, let alone converge to $\alpha^*$. This is the problem we are going to deal with in the sequel.
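The heuristic limiting recursion can be traced numerically. The following is a minimal sketch, assuming (as an illustration only, not the general setting of the paper) the filter $h_j(\alpha) = \alpha^j$ and white noise, for which $\gamma(\alpha)$ has a closed form; `snr` denotes the unfiltered ratio $\sigma^2$ over the noise variance, and all names are hypothetical.

```python
import math

def contraction_coefficient(alpha, omega0, snr):
    """C(alpha) = 1 / (1 + gamma(alpha)) for the assumed filter and white noise.

    For h_j = alpha**j, |H(w0)|^2 = 1 / (1 - 2 alpha cos(w0) + alpha^2) and the
    filtered noise variance is noise_var / (1 - alpha^2).
    """
    gain = 1.0 / (1.0 - 2.0 * alpha * math.cos(omega0) + alpha * alpha)
    gamma = snr * gain * (1.0 - alpha * alpha)
    return 1.0 / (1.0 + gamma)

def limiting_cm_iteration(alpha0, omega0, snr, steps):
    """Iterate alpha <- alpha* + C(alpha) (alpha - alpha*), the n = infinity recursion."""
    alpha_star = math.cos(omega0)
    path = [alpha0]
    for _ in range(steps):
        a = path[-1]
        path.append(alpha_star + contraction_coefficient(a, omega0, snr) * (a - alpha_star))
    return path

omega0 = 1.0
path = limiting_cm_iteration(alpha0=-0.5, omega0=omega0, snr=1.0, steps=30)
errors = [abs(a - math.cos(omega0)) for a in path]
print(errors[0], errors[10], errors[-1])
```

In this sketch the error shrinks at every step because $C(\alpha) < 1$ throughout $(-1, 1)$; this concerns only the idealized recursion and says nothing yet about a finite sample.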


3  Another  Look  at  the  CM  Method 


A close examination of (2.4) reveals that if we define $G(\alpha) := 1 - C(\alpha)$, i.e.,

$$G(\alpha) = \frac{\gamma(\alpha)}{1 + \gamma(\alpha)},$$

then (2.4) can be rewritten as

$$\alpha - \rho(\alpha) = G(\alpha)\,(\alpha - \alpha^*). \tag{3.1}$$

From this equation, it becomes quite clear that if $G(\alpha) > 0$ (or, equivalently, $\gamma(\alpha) > 0$, i.e., the filter $\mathcal{L}_\alpha$ captures the signal) for all $\alpha$ in a neighborhood of $\alpha^*$, then $\alpha^*$ would be the unique fixed-point of $\rho(\alpha)$ in that neighborhood. This observation leads to the somewhat more general idea of estimating $\omega_0$ by finding the fixed-point of $\hat\rho_n(\alpha)$ in a neighborhood of $\alpha^*$. The fixed-point of $\hat\rho_n(\alpha)$ will be referred to as the CM estimator. As will be seen later, if taken to be the first-order sample autocorrelation, $\hat\rho_n(\alpha)$ (and in fact variants thereof) does form a contraction mapping, not in the entire interval $[\underline{\alpha}, \overline{\alpha}]$, but in a neighborhood of $\alpha^*$, provided $C(\alpha^*) < 1$. Therefore, we retain the use of the term "contraction mapping (CM) method" for any procedure that finds a fixed-point of $\hat\rho_n(\alpha)$, or, equivalently, a zero of the function

$$f_n(\alpha) := \alpha - \hat\rho_n(\alpha).$$

Apparently, (2.5) is only a special procedure of this kind. In fact, one can use any algorithm available in the numerical analysis literature to find the zero of $f_n(\alpha)$. For instance, $\{\hat\alpha_n^{(m)}\}$ can be produced by the secant method [17]

$$\hat\alpha_n^{(m)} := \hat\alpha_n^{(m-1)} - \frac{\hat\alpha_n^{(m-1)} - \hat\alpha_n^{(m-2)}}{f_n\big(\hat\alpha_n^{(m-1)}\big) - f_n\big(\hat\alpha_n^{(m-2)}\big)}\; f_n\big(\hat\alpha_n^{(m-1)}\big), \qquad m = 1, 2, \ldots \tag{3.2}$$

which, as will be seen later, converges faster than (2.5) under appropriate conditions.
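To make the comparison between the fixed-point iteration and the secant method concrete, here is a small sketch that applies both to $f(\alpha) = \alpha - \rho(\alpha)$. As a stand-in for the sample quantity $\hat\rho_n(\alpha)$ it uses the theoretical $\rho(\alpha)$ of (2.4) under an assumed setting (filter $h_j(\alpha) = \alpha^j$, white noise, `snr` equal to $\sigma^2$ over the noise variance); the tolerances, starting values, and function names are illustrative choices, not taken from the paper.

```python
import math

def rho(alpha, omega0, snr):
    """Theoretical rho(alpha) = alpha* + C(alpha)(alpha - alpha*) for the assumed filter."""
    a_star = math.cos(omega0)
    gain = 1.0 / (1.0 - 2.0 * alpha * a_star + alpha * alpha)
    gamma = snr * gain * (1.0 - alpha * alpha)
    return a_star + (alpha - a_star) / (1.0 + gamma)

def fixed_point_iteration(g, a0, tol=1e-12, max_iter=500):
    """Iterate a <- g(a) as in (2.5); return the estimate and the step count."""
    a = a0
    for m in range(1, max_iter + 1):
        a_next = g(a)
        if abs(a_next - a) < tol:
            return a_next, m
        a = a_next
    return a, max_iter

def secant(f, a_prev, a_curr, tol=1e-12, max_iter=100):
    """Secant iteration of the form (3.2); return the estimate and the step count."""
    f_prev, f_curr = f(a_prev), f(a_curr)
    for m in range(1, max_iter + 1):
        denom = f_curr - f_prev
        if denom == 0.0:            # flat secant line: stop rather than divide by zero
            return a_curr, m - 1
        a_next = a_curr - f_curr * (a_curr - a_prev) / denom
        if abs(a_next - a_curr) < tol:
            return a_next, m
        a_prev, f_prev = a_curr, f_curr
        a_curr, f_curr = a_next, f(a_next)
    return a_curr, max_iter

omega0, snr = 1.0, 1.0
g = lambda a: rho(a, omega0, snr)
f = lambda a: a - g(a)
a_fpi, n_fpi = fixed_point_iteration(g, 0.4)
a_sec, n_sec = secant(f, 0.4, 0.45)
print(n_fpi, n_sec, a_fpi, a_sec, math.cos(omega0))
```

In this setting both procedures reach $\cos\omega_0$, with the secant method needing noticeably fewer steps, which is consistent with the linear versus superlinear rates discussed in the text.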


4  Existence,  Convergence,  and  Consistency 

Based on a finite sample $\{y_0(\alpha), y_1(\alpha), \ldots, y_{n-1}(\alpha)\}$, it is always possible to construct an estimator $\hat\rho_n(\alpha)$ as a function of $\alpha$ (for instance, by taking $\hat\rho_n(\alpha)$ to be the sample autocorrelation of the filtered data). The following questions are of interest:

1)  Whether  pn(a )  has  a  fixed-point  in  a  neighborhood  of  a*; 

2)  If  it  does,  under  what  conditions  these  iterative  algorithms  converge  to  the  fixed-point; 

3)  Whether  the  fixed-point  is  consistent  for  estimating  a*  as  n  tends  to  infinity. 

In  this  section,  we  would  like  to  answer  these  questions  one  by  one. 

It is worth noting that the almost sure convergence of the CM method has been recently proved in [12] for bandpass filters whose bandwidth shrinks at a certain rate during the iteration. In [12], the zero-crossing rate (ZCR) was used in connection with the Gaussian assumption, resulting in a scheme of the form (1.3) with $\hat\rho_n(\alpha)$ replaced by the cosine of the asymptotic (i.e., $n = \infty$) ZCR of the filtered data.
4.1  Existence  of  the  CM  Estimator


First of all, let us investigate conditions under which $\hat\rho_n(\alpha)$ has a fixed-point in a neighborhood of a specific $\alpha_0$. For this purpose, we have

Lemma 4.1 Let $\alpha_0$ be any fixed number in $[\underline{\alpha}, \overline{\alpha}]$. Suppose that there exist constants $K$ and $\delta$ with $0 < K < 1$ and $\delta > 0$ such that $S_\delta(\alpha_0) := \{\alpha : |\alpha - \alpha_0| < \delta\} \subset [\underline{\alpha}, \overline{\alpha}]$ and that with probability tending to one as $n \to \infty$ (or with probability one for sufficiently large $n$) $\hat\rho_n(\alpha)$ satisfies

(a) $|\hat\rho_n(\alpha') - \hat\rho_n(\alpha'')| \le K\,|\alpha' - \alpha''|$, $\forall\ \alpha', \alpha'' \in S_\delta(\alpha_0) := \{\alpha : |\alpha - \alpha_0| < \delta\}$;

(b) $|\alpha_0 - \hat\rho_n(\alpha_0)| < (1 - K)\,\delta$.

Then $\hat\rho_n(\alpha)$ has a unique fixed-point in $S_\delta(\alpha_0)$ with probability tending to one as $n \to \infty$ (or with probability one for sufficiently large $n$).

Proof. The assertions follow directly from Theorem 5.2.3 in [17]. ◊

Remark 4.1 Conditions in Lemma 4.1 can be relaxed considerably by allowing $K$ and $\delta$ to be random variables. In other words, the conclusion in Lemma 4.1 remains valid if conditions (a) and (b) hold with probability tending to one (or with probability one for large $n$) for some random variables $K$ and $\delta$ satisfying $0 < K < 1$ and $\delta > 0$ with probability one. If this is the case, $S_\delta(\alpha_0)$ is apparently a random interval. ◊

Remark 4.2 Under conditions (a) and (b), $\hat\rho_n(\alpha)$ becomes a contraction mapping on $S_\delta(\alpha_0)$. In fact, the contractivity is readily seen from (a). Moreover, combining (a) and (b) gives

$$|\hat\rho_n(\alpha) - \alpha_0| \le |\hat\rho_n(\alpha) - \hat\rho_n(\alpha_0)| + |\hat\rho_n(\alpha_0) - \alpha_0| < K\delta + (1 - K)\,\delta = \delta$$

for all $\alpha \in S_\delta(\alpha_0)$, that is, $\hat\rho_n(\alpha)$ maps $S_\delta(\alpha_0)$ into itself. ◊

According to Lemma 4.1, the existence of a unique fixed-point of $\hat\rho_n(\alpha)$ in a neighborhood of $\alpha_0$ is guaranteed by conditions (a) and (b). An ideal candidate for $\alpha_0$ in our problem is obviously $\alpha^*$. Let $e_n(\alpha)$ be the error of the estimator $\hat\rho_n(\alpha)$ for estimating $\rho(\alpha)$, i.e., $e_n(\alpha) := \hat\rho_n(\alpha) - \rho(\alpha)$. With this notation, $\hat\rho_n(\alpha)$ can be written as

$$\hat\rho_n(\alpha) = \rho(\alpha) + e_n(\alpha).$$

For the existence of a unique fixed-point of $\hat\rho_n(\alpha)$ in a neighborhood of $\alpha^*$, i.e., the existence of the CM estimator, we have the following theorem.

Theorem 4.1 Suppose that on $S_\Delta(\alpha^*) \subset [\underline{\alpha}, \overline{\alpha}]$, $\rho(\alpha)$ is continuously differentiable and $\hat\rho_n(\alpha)$ is differentiable with probability tending to one as $n \to \infty$ (or with probability one for sufficiently large $n$). Assume further that $C(\alpha)$ is continuous at $\alpha^*$ with $C(\alpha^*) < 1$. If $\hat\rho_n(\alpha)$ is uniformly consistent on $S_\Delta(\alpha^*)$ up to the first derivative, i.e., if

$$\lim_{n\to\infty}\ \sup_{\alpha \in S_\Delta(\alpha^*)} |e_n(\alpha)| = 0 \tag{4.1}$$

$$\lim_{n\to\infty}\ \sup_{\alpha \in S_\Delta(\alpha^*)} |e_n'(\alpha)| = 0 \tag{4.2}$$

in probability (or with probability one), then there exists $0 < \delta < \Delta$ such that $\hat\rho_n(\alpha)$ has a unique fixed-point in $S_\delta(\alpha^*)$ with probability tending to one as $n \to \infty$ (or with probability one for sufficiently large $n$).


Proof. Consider the theoretical function $\rho(\alpha)$. The continuity of $C(\alpha)$ at $\alpha^*$ implies that, from (2.4), the derivative of $\rho(\alpha)$ at $\alpha^*$ can be written as

$$\rho'(\alpha^*) = \lim_{\alpha\to\alpha^*} \frac{\rho(\alpha) - \rho(\alpha^*)}{\alpha - \alpha^*} = \lim_{\alpha\to\alpha^*} \frac{\rho(\alpha) - \alpha^*}{\alpha - \alpha^*} = \lim_{\alpha\to\alpha^*} C(\alpha) = C(\alpha^*).$$

Since $0 < C(\alpha^*) < 1$, i.e., $\gamma(\alpha^*) > 0$ so that $\mathcal{L}_\alpha$ captures the frequency with $\alpha = \alpha^*$, and since $\rho'(\alpha)$ is continuous, there exists $0 < \delta < \Delta$ such that

$$M := \sup_{\alpha \in S_\delta(\alpha^*)} \rho'(\alpha) < 1 \tag{4.3}$$

$$m := \inf_{\alpha \in S_\delta(\alpha^*)} \rho'(\alpha) > 0. \tag{4.4}$$

For any $0 < \delta' < \delta$, assumption (4.2), together with (4.3) and (4.4), implies that with probability tending to one as $n \to \infty$ (or with probability one for sufficiently large $n$) there exists $K$ such that

$$0 < m + e_n'(\alpha) \le \hat\rho_n'(\alpha) \le M + e_n'(\alpha) \le K < 1 \tag{4.5}$$

for all $\alpha \in S_{\delta'}(\alpha^*)$. By the mean-value theorem, condition (a) in Lemma 4.1 is guaranteed with $\delta'$ in place of $\delta$. Moreover, from (4.1),

$$|\alpha^* - \hat\rho_n(\alpha^*)| = |\rho(\alpha^*) - \hat\rho_n(\alpha^*)| = |e_n(\alpha^*)| < (1 - K)\,\delta' \tag{4.6}$$

with probability tending to one as $n \to \infty$ (or with probability one for sufficiently large $n$), which gives condition (b) in Lemma 4.1. Therefore, by Lemma 4.1, $\hat\rho_n(\alpha)$ has a unique fixed-point in $S_{\delta'}(\alpha^*)$. Since $\delta'$ is arbitrary, this implies that $\hat\rho_n(\alpha)$ has a unique fixed-point in the interior of $S_\delta(\alpha^*)$. ◊

Remark 4.3 Theorem 4.1 still holds if the continuity of $\rho'(\alpha)$ is replaced by

$$|\rho'(\alpha)| \le M < 1 \quad \forall\ \alpha \in S_\Delta(\alpha^*). \tag{4.7}$$

However, this weaker condition does not guarantee the positivity of $\hat\rho_n'(\alpha)$. ◊

Remark 4.4 As in Lemma 4.1, inequalities (4.5) and (4.6) imply that $\hat\rho_n(\alpha)$ constitutes a contraction mapping in $S_\delta(\alpha^*)$ with probability tending to one as $n \to \infty$ (or with probability one for sufficiently large $n$). It is made possible basically by the requirement that $\hat\rho_n(\alpha)$ be uniformly consistent up to the first derivative and that the filter $\mathcal{L}_\alpha$ passes the frequency for all $\alpha$ in the vicinity of $\alpha^*$. ◊

4.2  Convergence  of  Two  Iterative  Algorithms 

Let $\hat\alpha_n$ be the CM estimator, i.e., the fixed-point of $\hat\rho_n(\alpha)$ in $S_\delta(\alpha^*)$ given by Theorem 4.1. The following theorem guarantees that under suitable conditions, the FPI procedure (2.5) can start anywhere in a neighborhood of $\hat\alpha_n$ so as to converge to $\hat\alpha_n$.

Theorem 4.2 Under the conditions in Theorem 4.1, there exist constants $\delta_0 > 0$ and $0 < K < 1$ such that with probability tending to one as $n \to \infty$ (or with probability one for sufficiently large $n$) the sequence $\{\hat\alpha_n^{(m)}\}$ defined by (2.5) stays in $S_{\delta_0}(\hat\alpha_n)$ and converges at least linearly to $\hat\alpha_n$ as $m$ tends to infinity, provided $\hat\alpha_n^{(0)} \in S_{\delta_0}(\hat\alpha_n)$. Moreover, the convergence is monotone and

$$|\hat\alpha_n^{(m)} - \hat\alpha_n| \le K\,|\hat\alpha_n^{(m-1)} - \hat\alpha_n| \tag{4.8}$$

for any $m \ge 1$.
Proof. Since $\hat\alpha_n \in S_\delta(\alpha^*)$, there exists $\delta_0 > 0$ such that $S_{\delta_0}(\hat\alpha_n) \subset S_\delta(\alpha^*)$. From the proof of Theorem 4.1, a constant $K$ can be found so that $0 < K < 1$ and (4.5) holds for all $\alpha \in S_{\delta_0}(\hat\alpha_n)$. According to the mean-value theorem, this implies that condition (a) of Lemma 4.1 holds for all $\alpha'$ and $\alpha''$ in $S_{\delta_0}(\hat\alpha_n)$. In particular,

$$|\hat\rho_n(\alpha) - \hat\alpha_n| \le K\,|\alpha - \hat\alpha_n| \quad \forall\ \alpha \in S_{\delta_0}(\hat\alpha_n).$$

Therefore, $\hat\alpha_n^{(m)} \in S_{\delta_0}(\hat\alpha_n)$ for any $m \ge 1$, provided $\hat\alpha_n^{(0)} \in S_{\delta_0}(\hat\alpha_n)$. The inequality (4.8) and thus the convergence of $\{\hat\alpha_n^{(m)}\}$ are consequences of condition (a) in Lemma 4.1 with $S_{\delta_0}(\hat\alpha_n)$ in place of $S_\delta(\alpha_0)$. The monotonicity is due to the fact that $\hat\rho_n'(\alpha)$ is positive in $S_{\delta_0}(\hat\alpha_n)$. ◊

Remark 4.5 Theorem 4.2 remains valid if $K$ and $\delta_0$ are allowed to be random. If the continuity of $\rho'(\alpha)$ is replaced by (4.7), the convergence of $\{\hat\alpha_n^{(m)}\}$ still holds but the monotonicity is no longer guaranteed. ◊

It is evident from (4.3), (4.5), (4.6), and (4.8) that the rate of convergence of the iterative procedure (2.5) is governed by the constant $C(\alpha^*)$ (known as the contraction coefficient) and the estimation accuracy of $\hat\rho_n(\alpha)$ up to the first derivative. Usually, the estimation accuracy depends heavily on the sample size $n$, which one cannot control. What one can do to accelerate the convergence is to make $C(\alpha^*)$ as small as possible by using appropriate filters during the iteration. Since decreasing $C(\alpha^*)$ is equivalent to increasing $\gamma(\alpha^*)$, the signal-to-noise ratio after filtering with $\mathcal{L}_{\alpha^*}$, the convergence would be accelerated if the signal could be enhanced in an appropriate way. Since $\alpha^*$ is unknown, the only possibility of enhancing the signal during the iteration relies on $\hat\alpha_n^{(m)}$. Some strategies based on shrinking the bandwidth of the filters were discussed in [10, 12].

Now let us consider the secant method defined by (3.2). Under proper conditions, this method has superlinear convergence. To study its convergence, the second derivatives of $\rho(\alpha)$ and $e_n(\alpha)$ are required.

Theorem 4.3 Under the conditions in Theorem 4.1, if in addition there exists $0 < \delta_0 < \delta$ such that $\rho(\alpha)$ and $e_n(\alpha)$ have second derivatives that are uniformly bounded by $D$ on $S_{\delta_0}(\hat\alpha_n)$ with probability tending to one as $n \to \infty$ (or with probability one for sufficiently large $n$), then, starting with $\hat\alpha_n^{(-1)}, \hat\alpha_n^{(0)} \in S_{\delta_1}(\hat\alpha_n)$ where $\delta_1 := \min\{(1 - K)/D,\ \delta_0\}$ and $K$ is given by (4.5), the sequence $\{\hat\alpha_n^{(m)}\}$ generated by the secant method (3.2) converges at least superlinearly to $\hat\alpha_n$ with probability tending to one as $n \to \infty$ (or with probability one for sufficiently large $n$), and for some $c > 0$ and $0 < \lambda < 1$,

$$|\hat\alpha_n^{(m)} - \hat\alpha_n| \le c\,\lambda^{p(m)} \tag{4.9}$$

where

$$p(m) := \left(\frac{1 + \sqrt{5}}{2}\right)^{m}.$$


Proof. Note that (4.5) and the boundedness of second derivatives imply

$$\left|\frac{f_n''(\alpha)}{2 f_n'(\alpha)}\right| = \frac{|\rho''(\alpha) + e_n''(\alpha)|}{2\,(1 - \hat\rho_n'(\alpha))} \le \frac{D}{1 - K}$$

for all $\alpha \in S_{\delta_1}(\hat\alpha_n)$. The remaining proof follows the argument in [17] (pp. 292-293). ◊

Remark 4.6 The secant method may require better initial estimates than the FPI procedure to assure its convergence. Moreover, the secant method is numerically not as stable as the FPI, although it converges faster with proper initial estimates. ◊

4.3  Consistency  of  the  CM  Estimator 

Suppose that in a neighborhood $S_\delta(\alpha^*)$ of $\alpha^*$, the function $\hat\rho_n(\alpha)$ has a fixed-point $\hat\alpha_n$ with probability tending to one as $n \to \infty$ (or with probability one for sufficiently large $n$). We are interested in the consistency of $\hat\alpha_n$, referred to as the CM estimator, as $n$ tends to infinity. For this purpose, we notice that from (3.1) we obtain

$$G(\hat\alpha_n)\,(\hat\alpha_n - \alpha^*) = e_n(\hat\alpha_n) \tag{4.10}$$

with probability tending to one as $n \to \infty$ (or with probability one for sufficiently large $n$). Clearly, the behavior of $\hat\alpha_n$ depends entirely on that of $G(\alpha)$ (known as the gain coefficient) and of $e_n(\alpha)$ in $S_\delta(\alpha^*)$. For the consistency of $\hat\alpha_n$, we have the following results.

Theorem 4.4 Let $\hat\alpha_n$ be the fixed-point of $\hat\rho_n(\alpha)$ in $S_\delta(\alpha^*)$. Suppose that $e_n(\alpha)$ satisfies (4.1) in probability (or with probability one) and that $G(\alpha) \ge g$ for some $g > 0$ and all $\alpha \in S_\delta(\alpha^*)$. Then $\hat\alpha_n$ converges to $\alpha^*$ in probability (or with probability one) as $n \to \infty$. In both cases, the convergence is also in mean-square.

Proof. The convergence in probability (or with probability one) follows immediately from (4.1) and (4.10). The mean-square convergence is due to the boundedness of $\hat\alpha_n$. ◊
5  Consistency  of  Sample  Autocorrelation


As seen in the preceding section, the consistency assumption (4.1)-(4.2) plays an important role in the proof of existence, convergence, and consistency of the CM estimators. To verify this assumption, we specialize in this section to the usual sample autocorrelation $\hat\rho_n(\alpha)$, as will be defined later, and investigate its consistency when the filter $\mathcal{L}_\alpha$ satisfies the following conditions:

(H1) $\{h_j(\alpha)\} \ne \{0\}$ and $h_j(\alpha) = 0$ for $j < 0$.

(H2) There exist constants $a_j > 0$ such that
$$\sum_{j=0}^{\infty} a_j < \infty \quad\text{and}\quad |h_j(\alpha)| \le a_j$$
for all $j = 0, 1, \ldots$ and all $\alpha \in A$, where $A$ is a closed subset of $[\underline{\alpha}, \overline{\alpha}]$.

(H3) $h_j'(\alpha)$ exists and there are constants $b_j > 0$ such that
$$\sum_{j=0}^{\infty} b_j < \infty \quad\text{and}\quad |h_j'(\alpha)| \le b_j$$
for all $j = 0, 1, \ldots$ and all $\alpha \in A$.

These conditions can be easily fulfilled by many commonly-used filters. Some examples will be given in the next section.

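Given a finite sample $\{y_0, y_1, \ldots, y_{n-1}\}$ of size $n$, let $\{\hat y_t(\alpha)\}$ be the filtered data defined by

$$\hat y_t(\alpha) := \sum_{j=0}^{t} h_j(\alpha)\, y_{t-j}, \qquad t = 0, 1, \ldots, n-1. \tag{5.1}$$

On the basis of $\{\hat y_t(\alpha)\}$, a widely-used estimator of the first-order autocorrelation $\rho(\alpha)$ is the least squares (LS) estimator that minimizes $\sum_{t=1}^{n-1} [\hat y_t(\alpha) - \rho\, \hat y_{t-1}(\alpha)]^2$, i.e.,

$$\hat\rho_n(\alpha) := \frac{\hat r_1(\alpha)}{\hat r_0(\alpha)} \tag{5.2}$$

where $\hat r_0(\alpha)$ and $\hat r_1(\alpha)$ are the sample variance and first-order autocovariance of $\{\hat y_t(\alpha)\}$ defined by

$$\hat r_0(\alpha) := \frac{1}{n} \sum_{t=1}^{n-1} \hat y_{t-1}^2(\alpha) \tag{5.3}$$

$$\hat r_1(\alpha) := \frac{1}{n} \sum_{t=1}^{n-1} \hat y_t(\alpha)\, \hat y_{t-1}(\alpha). \tag{5.4}$$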
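The estimator just defined can be combined with the iteration (2.5) in a few lines. The sketch below is illustrative only and rests on several assumptions that are not part of the paper: the filter is $h_j(\alpha) = \alpha^j$ (so the truncated sum in (5.1) can be computed recursively), the noise is Gaussian white noise, the sample variance sums $\hat y_{t-1}^2$ over $t = 1, \ldots, n-1$, and the iterate is clipped to $[-0.999, 0.999]$ as a safeguard to keep the filter stable. All function names, the seed, and the parameter values are hypothetical.

```python
import math
import random

def simulate(n, omega0, beta, noise_sd, rng):
    """Draw y_t = beta cos(omega0 t + phi) + eps_t with a uniform random phase."""
    phase = rng.uniform(0.0, 2.0 * math.pi)
    return [beta * math.cos(omega0 * t + phase) + rng.gauss(0.0, noise_sd)
            for t in range(n)]

def filtered(y, alpha):
    """Truncated filter (5.1) for h_j = alpha**j, via z_t = alpha z_{t-1} + y_t."""
    out, prev = [], 0.0
    for value in y:
        prev = alpha * prev + value
        out.append(prev)
    return out

def sample_rho(y, alpha):
    """LS estimate r1 / r0 of the lag-one autocorrelation of the filtered data."""
    z = filtered(y, alpha)
    r1 = sum(z[t] * z[t - 1] for t in range(1, len(z)))
    r0 = sum(z[t - 1] ** 2 for t in range(1, len(z)))
    return r1 / r0

def cm_estimate(y, alpha0, tol=1e-10, max_iter=200):
    """Fixed-point iteration (2.5) on the sample autocorrelation."""
    a = alpha0
    for _ in range(max_iter):
        a_next = max(-0.999, min(0.999, sample_rho(y, a)))  # stability safeguard
        if abs(a_next - a) < tol:
            return a_next
        a = a_next
    return a

omega0 = 1.0
rng = random.Random(0)                       # fixed seed for reproducibility
y = simulate(n=5000, omega0=omega0, beta=2.0, noise_sd=1.0, rng=rng)
alpha_hat = cm_estimate(y, alpha0=0.0)
omega_hat = math.acos(alpha_hat)             # frequency estimate from alpha_hat
print(alpha_hat, math.cos(omega0), omega_hat)
```

The frequency estimate is recovered as the arccosine of the fixed point, since the fixed point estimates $\cos\omega_0$.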
We would like to show that if (H1)-(H3) are satisfied in a neighborhood of $\alpha^*$, then (4.1) and (4.2) hold with probability one, provided $\{\epsilon_t\}$ is a linear process, i.e.,

$$\epsilon_t = \sum_{j=-\infty}^{\infty} \psi_j\, z_{t-j} \tag{5.5}$$

where $\{z_t\}$ are IID(0, 1) and $\sum_{j=-\infty}^{\infty} |\psi_j| < \infty$. In this case, the noise spectral distribution function $F$ is given by

$$F(\omega) = \frac{1}{2\pi} \int_{-\pi}^{\omega} \Big| \sum_{j=-\infty}^{\infty} \psi_j\, e^{-ij\lambda} \Big|^2 d\lambda$$

where $i := \sqrt{-1}$. Before stating this result, the following lemmas are needed.


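Lemma 5.1 Let $\{\epsilon_t\}$ be a linear process defined by (5.5). Suppose that $\{g_j(\alpha)\}$ and $\{h_j(\alpha)\}$ satisfy assumptions (H1) and (H2). Then as $n \to \infty$,

$$\frac{1}{n} \sum_{t=0}^{n-\tau-1} \Big( \sum_{j=0}^{t+\tau} h_j(\alpha)\, y_{t-j+\tau} \Big) \Big( \sum_{j=0}^{t} g_j(\alpha)\, y_{t-j} \Big) \xrightarrow{\text{a.s.}} \sigma^2\, \Re\big\{ H(\omega_0;\alpha)\, \overline{G(\omega_0;\alpha)}\, e^{i\tau\omega_0} \big\} + \int_{-\pi}^{\pi} H(\omega;\alpha)\, \overline{G(\omega;\alpha)}\, e^{i\tau\omega}\, dF(\omega) \tag{5.6}$$

uniformly in $\alpha \in A$, where $\tau \ge 0$,

$$G(\omega;\alpha) := \sum_{j=0}^{\infty} g_j(\alpha)\, e^{-ij\omega},$$

and $\Re\{\cdot\}$ stands for the real part of a complex number.

Proof. See Appendix A. ◊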

Remark 6.1 Lemma 5.1 can be generalized to the case of multiple sinusoids in noise. In this case, the observation $\{y_t\}$ is given by

$$y_t = \sum_{k=0}^{q-1} \beta_k \cos(\omega_k t + \phi_k) + \epsilon_t \tag{5.7}$$

where $q \ge 1$, $\beta_k > 0$ and $0 < \omega_0 < \cdots < \omega_{q-1} < \pi$ are constants, $\{\phi_k\}$ are iid $U[0, 2\pi)$ and independent of $\{\epsilon_t\}$. Under the same conditions as Lemma 5.1, it can be shown that as $n$ tends to infinity,

$$\frac{1}{n} \sum_{t=0}^{n-\tau-1} \Big( \sum_{j=0}^{t+\tau} h_j(\alpha)\, y_{t-j+\tau} \Big) \Big( \sum_{j=0}^{t} g_j(\alpha)\, y_{t-j} \Big) \xrightarrow{\text{a.s.}} \sum_{k=0}^{q-1} \sigma_k^2\, \Re\big\{ H(\omega_k;\alpha)\, \overline{G(\omega_k;\alpha)}\, e^{i\tau\omega_k} \big\} + \int_{-\pi}^{\pi} H(\omega;\alpha)\, \overline{G(\omega;\alpha)}\, e^{i\tau\omega}\, dF(\omega) \tag{5.8}$$

uniformly in $\alpha \in A$, where $\sigma_k^2 := \beta_k^2/2$ is the variance of the $k$th sinusoid. In fact, using the same
method,  the  counterparts  of  n-1  ICnTo7"*1  h(t)  and  n-1  X]"=oT_1  in  the  proof  of  Lemma  5.1 
can  be  shown  to  have  the  same  limits  given  by  (A. 6)  and  (A. 7),  and  in  the  counterpart  of 
n~l  YTt:r-  A(0  the  cross-product  terms  with  different  frequencies  converge  to  zero  as  n  oo 
since 

$$\sum_{t=0}^{n-1} \cos\omega_k t\, \cos\omega_{k'} t = O(1), \qquad \sum_{t=0}^{n-1} \sin\omega_k t\, \sin\omega_{k'} t = O(1)$$

for $k \ne k'$ and

$$\sum_{t=0}^{n-1} \sin\omega_k t\, \cos\omega_{k'} t = O(1)$$

for any $k$ and $k'$. ◊

Remark 6.2 In the proof of Lemma 5.1, the assumption that $\phi \sim U[0, 2\pi)$ is not necessary. It is required only if we want $\{y_t\}$ to be stationary. This remark also applies to the case of multiple sinusoids. ◊


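Denote by $r_0(\alpha)$ and $r_1(\alpha)$ the variance and the first-order autocovariance of the filtered data $\{y_t(\alpha)\}$ defined by (2.1), respectively. Then, writing $r^{\epsilon}_{\tau}$ for the lag-$\tau$ autocovariance of $\{\epsilon_t\}$, we have

$$r_0(\alpha) = \sigma^2 |H(\omega_0;\alpha)|^2 + \int_{-\pi}^{\pi} |H(\omega;\alpha)|^2\, dF(\omega) = \sigma^2 \Big| \sum_{j} h_j(\alpha)\, e^{-ij\omega_0} \Big|^2 + \sum_{j} \sum_{k} h_j(\alpha)\, h_k(\alpha)\, r^{\epsilon}_{k-j}$$

$$r_1(\alpha) = \sigma^2 |H(\omega_0;\alpha)|^2 \cos\omega_0 + \int_{-\pi}^{\pi} |H(\omega;\alpha)|^2 \cos\omega\, dF(\omega) = \sigma^2 \Big| \sum_{j} h_j(\alpha)\, e^{-ij\omega_0} \Big|^2 \cos\omega_0 + \sum_{j} \sum_{k} h_j(\alpha)\, h_k(\alpha)\, r^{\epsilon}_{k-j+1}.$$

It is readily seen that under assumptions (H2) and (H3), $r_0(\alpha)$ and $r_1(\alpha)$ are differentiable with respect to $\alpha$ and their derivatives, denoted by $r_0'(\alpha)$ and $r_1'(\alpha)$, respectively, can be easily shown to be

$$r_0'(\alpha) = 2\sigma^2\, \Re\big\{ H'(\omega_0;\alpha)\, \overline{H(\omega_0;\alpha)} \big\} + 2 \int_{-\pi}^{\pi} H'(\omega;\alpha)\, \overline{H(\omega;\alpha)}\, dF(\omega)$$

$$r_1'(\alpha) = 2\sigma^2\, \Re\big\{ H'(\omega_0;\alpha)\, \overline{H(\omega_0;\alpha)} \big\} \cos\omega_0 + 2 \int_{-\pi}^{\pi} H'(\omega;\alpha)\, \overline{H(\omega;\alpha)} \cos\omega\, dF(\omega)$$

where $H'(\omega;\alpha) := \sum_{j=0}^{\infty} h_j'(\alpha)\, e^{-ij\omega}$.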
Based on Lemma 5.1, we are now able to obtain consistency of the sample variance $\hat r_0(\alpha)$, the sample covariance $\hat r_1(\alpha)$, and their derivatives, as estimators of $r_0(\alpha)$, $r_1(\alpha)$, and their derivatives. For this purpose, we have

Lemma 5.2 Let $\{\epsilon_t\}$ be a linear process defined by (5.5). Suppose that $\{h_j(\alpha)\}$ satisfies (H1)-(H3). Then as $n$ tends to infinity,

$$\hat r_0(\alpha) \xrightarrow{\text{a.s.}} r_0(\alpha), \quad \hat r_1(\alpha) \xrightarrow{\text{a.s.}} r_1(\alpha), \quad \hat r_0'(\alpha) \xrightarrow{\text{a.s.}} r_0'(\alpha), \quad\text{and}\quad \hat r_1'(\alpha) \xrightarrow{\text{a.s.}} r_1'(\alpha)$$

uniformly in $\alpha \in A$.

Proof. See Appendix B. ◊

Using these lemmas, we are able to obtain the uniform strong consistency of $\hat\rho_n(\alpha)$ and $\hat\rho_n'(\alpha)$ as follows.

Theorem 5.1 Let $\{\epsilon_t\}$ be a linear process defined by (5.5) and suppose that assumptions (H1)-(H3) are satisfied. Then as $n \to \infty$,

$$\hat\rho_n(\alpha) \xrightarrow{\text{a.s.}} \rho(\alpha) \quad\text{and}\quad \hat\rho_n'(\alpha) \xrightarrow{\text{a.s.}} \rho'(\alpha)$$

uniformly in $\alpha \in A$, where $\hat\rho_n(\alpha)$ is defined by (5.2).

Proof. From the definition (5.2) of $\hat\rho_n(\alpha)$ and the fact that

$$\hat\rho_n'(\alpha) = \frac{\hat r_0(\alpha)\, \hat r_1'(\alpha) - \hat r_1(\alpha)\, \hat r_0'(\alpha)}{\hat r_0^2(\alpha)},$$

the assertion in this theorem follows immediately upon using Lemma 5.2. ◊

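Remark 6.3 There are many other commonly-used estimators of $\rho(\alpha)$. For example, $\hat\rho_n(\alpha)$ can be defined as the minimizer of

$$\sum_{t=1}^{n-1} [\hat y_{t-2}(\alpha) - \rho\, \hat y_{t-1}(\alpha)]^2 + \sum_{t=1}^{n-1} [\hat y_t(\alpha) - \rho\, \hat y_{t-1}(\alpha)]^2,$$

yielding

$$\hat\rho_n(\alpha) = \frac{\sum_{t=1}^{n-1} \hat y_{t-1}(\alpha)\, [\hat y_t(\alpha) + \hat y_{t-2}(\alpha)]}{2 \sum_{t=1}^{n-1} \hat y_{t-1}^2(\alpha)}. \tag{5.9}$$

It can be shown, by an argument similar to that in the proof of Lemma 5.2, that all these estimators are uniformly equivalent as $n$ tends to infinity, and, therefore, Theorem 5.1 remains valid for these estimators. ◊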
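The closeness of the LS estimator (5.2) and the symmetric estimator (5.9) can be seen on a noise-free sinusoid, where the target value $\cos\omega$ is known exactly. The sketch below is an illustration under assumptions made here: the input is the raw sequence $\cos(\omega t + 0.3)$ rather than filtered data, and the symmetric sums run over $t = 2, \ldots, n-1$ so that every index is available. The function names are hypothetical.

```python
import math

def ls_estimate(z):
    """LS form: sum z[t] z[t-1] / sum z[t-1]^2 over t = 1..n-1."""
    num = sum(z[t] * z[t - 1] for t in range(1, len(z)))
    den = sum(z[t - 1] ** 2 for t in range(1, len(z)))
    return num / den

def symmetric_estimate(z):
    """Symmetric form in the spirit of (5.9), summed over t = 2..n-1."""
    num = sum(z[t - 1] * (z[t] + z[t - 2]) for t in range(2, len(z)))
    den = 2.0 * sum(z[t - 1] ** 2 for t in range(2, len(z)))
    return num / den

omega, n = 1.0, 1000
z = [math.cos(omega * t + 0.3) for t in range(n)]   # noise-free sinusoid
rho_ls = ls_estimate(z)
rho_sym = symmetric_estimate(z)
print(rho_ls, rho_sym, math.cos(omega))
```

For a pure sinusoid the identity $z_t + z_{t-2} = 2\cos\omega\, z_{t-1}$ makes the symmetric form return $\cos\omega$ up to rounding, while the LS form differs from it only by end effects of order $1/n$.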

As a consequence of this theorem, the results obtained in Section 4 can be restated as follows:

Corollary 5.1 Let $\{\epsilon_t\}$ be a linear process defined by (5.5). Suppose that $C(\alpha^*) < 1$ and (H1)-(H3) are satisfied with $A := S_\Delta(\alpha^*) \subset [\underline{\alpha}, \overline{\alpha}]$ for some $\Delta > 0$. Assume further that $\{h_j'(\alpha)\}$ are continuous on $S_\Delta(\alpha^*)$. Then the following results hold for sufficiently large $n$ with probability one:

(a) $\hat\rho_n(\alpha)$ has a unique fixed-point $\hat\alpha_n$ in some $S_\delta(\alpha^*) \subset A$.

(b) There exists $S_{\delta_0}(\hat\alpha_n) \subset S_\delta(\alpha^*)$ such that the sequence $\{\hat\alpha_n^{(m)}\}$ given by (2.5) converges monotonically and at least linearly to $\hat\alpha_n$ as $m \to \infty$, provided $\hat\alpha_n^{(0)} \in S_{\delta_0}(\hat\alpha_n)$. In this case, (4.8) also holds for any $m \ge 1$.

(c) If in addition $G(\alpha) \ge g > 0$ for all $\alpha \in S_\delta(\alpha^*)$, then $\hat\alpha_n$ converges to $\alpha^*$ with probability one as $n \to \infty$.

Proof. According to Theorem 4.2 and Theorem 4.4, it suffices to check the conditions in Theorem 4.1. To do this, we first notice that the differentiability of $\rho(\alpha)$ and the continuity of $C(\alpha)$ are guaranteed by (H2), (H3), and the continuity of $h_j'(\alpha)$. Moreover, (4.1) and (4.2) are consequences of Theorem 5.1. ◊

For the convergence of the secant method (3.2), an additional condition on the second derivative of $h_j(\alpha)$ is needed:

(H4) $h_j''(\alpha)$ exists and there are constants $c_j > 0$ such that

$$\sum_{j=0}^{\infty} c_j < \infty \quad\text{and}\quad |h_j''(\alpha)| \le c_j$$

for all $j = 0, 1, \ldots$ and all $\alpha \in A$.

Thus the secant method requires more stringent conditions than the FPI. By a similar argument, it is readily seen that under (H4), $\rho(\alpha)$ is twice differentiable and $\hat\rho_n''(\alpha)$ is a uniformly consistent estimator of $\rho''(\alpha)$. Therefore, we also have
Corollary 5.2 Suppose that the conditions in Corollary 5.1 hold. If in addition assumption (H4) is satisfied and $\rho''(\alpha)$ is bounded on $A := S_\Delta(\alpha^*)$, then there exists $\delta_1 > 0$ such that the sequence $\{\hat\alpha_n^{(m)}\}$ given by (3.2) converges to $\hat\alpha_n$ as $m \to \infty$ at least superlinearly with probability one for sufficiently large $n$, provided $\hat\alpha_n^{(-1)}, \hat\alpha_n^{(0)} \in S_{\delta_1}(\hat\alpha_n)$.

Proof. The assertions follow immediately from Theorem 4.3 upon noting that (H4) and the boundedness of $\rho''(\alpha)$ imply the boundedness of $\epsilon_n''(\alpha)$ with probability one for sufficiently large $n$.


6  Multiple  Sinusoids  in  Noise 

In  previous  sections,  most  of  the  results  were  restricted  to  the  case  of  a  single  sinusoid  in  noise 
defined  by  (1.1).  Now  let  us  consider  the  general  case  of  multiple  sinusoids  in  noise  given 
by  (5.7).  We  would  like  to  discuss  conditions  for  which  the  CM  method  provides  consistent 
estimates  of  the  unknown  frequencies. 

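We first notice that for the observation $\{y_t\}$ defined by (5.7), the first-order autocorrelation $\rho(\alpha)$ of the filtered data can be expressed as

$$\rho(\alpha) = \frac{\sum_{k=0}^{q-1} \sigma_k^2\,|H(\omega_k;\alpha)|^2 \cos\omega_k + \int_{-\pi}^{\pi} |H(\omega;\alpha)|^2 \cos\omega\,dF(\omega)}{\sum_{k=0}^{q-1} \sigma_k^2\,|H(\omega_k;\alpha)|^2 + \int_{-\pi}^{\pi} |H(\omega;\alpha)|^2\,dF(\omega)}.$$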


Define $\alpha_k^* := \cos\omega_k$. Then, under assumption (2.3), this expression reduces to a counterpart of (2.4),

$$\rho(\alpha) = \alpha_k^* + \sum_{j=0}^{q-1} G_j(\alpha)\,(\alpha_j^* - \alpha_k^*) + C_q(\alpha)\,(\alpha - \alpha_k^*), \tag{6.1}$$

where

$$C_q(\alpha) := \frac{1}{1 + \sum_{k=0}^{q-1}\gamma_k(\alpha)}, \qquad G_j(\alpha) := \frac{\gamma_j(\alpha)}{1 + \sum_{k=0}^{q-1}\gamma_k(\alpha)}, \qquad \gamma_k(\alpha) := \frac{\sigma_k^2\,|H(\omega_k;\alpha)|^2}{\int_{-\pi}^{\pi} |H(\omega;\alpha)|^2\,dF(\omega)}.$$

Clearly, $\gamma_k(\alpha)$ is the signal-to-noise ratio of the $k$th sinusoid after filtering with $\mathcal{L}_\alpha$, $G_j(\alpha) \ge 0$ is the gain coefficient of the $j$th sinusoid, and $C_q(\alpha)$ is the contraction coefficient satisfying $0 < C_q(\alpha) = 1 - \sum_{j=0}^{q-1} G_j(\alpha) \le 1$. In particular, from (6.1), we have

$$\rho(\alpha_k^*) = \alpha_k^* + \sum_{j=0}^{q-1} G_j(\alpha_k^*)\,(\alpha_j^* - \alpha_k^*).$$

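Suppose that $C_q(\alpha)$ is continuous and $G_j(\alpha)$ is differentiable at $\alpha_k^*$. Then $\rho(\alpha)$ is also differentiable at $\alpha_k^*$ and, by differentiating (6.1) and noting that the factor $(\alpha - \alpha_k^*)$ vanishes at $\alpha_k^*$, its derivative can be written as

$$\rho'(\alpha_k^*) = \sum_{j \ne k} G_j'(\alpha_k^*)\,(\alpha_j^* - \alpha_k^*) + C_q(\alpha_k^*).$$

Clearly, a sufficient condition for $\rho(\alpha)$ to be contractive in a neighborhood of $\alpha_k^*$ is that $\rho'(\alpha)$ is continuous in the vicinity of $\alpha_k^*$ and

$$-1 < \sum_{j \ne k} G_j'(\alpha_k^*)\,(\alpha_j^* - \alpha_k^*) + C_q(\alpha_k^*) < 1. \tag{6.2}$$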

The following theorem guarantees the existence of a unique fixed-point of $\hat\rho_n(\alpha)$ in a neighborhood of $\alpha_k^*$ and the convergence of iterative procedures (2.5) and (3.2).

Theorem 6.1 Under the conditions in Theorem 4.1 about $\rho(\alpha)$ and $\epsilon_n(\alpha)$, if (6.2) is satisfied, then there exist $0 < \delta_1 < \delta_0 < \delta < \Delta$ such that the following results hold with probability tending to one as $n \to \infty$ (or with probability one for sufficiently large $n$).

(a) $\hat\rho_n(\alpha)$ has a unique fixed-point $\hat\alpha_n$ in $S_\delta(\alpha_k^*)$.

(b) The sequence $\{\hat\alpha_n^{(m)}\}$ defined by (2.5) converges to $\hat\alpha_n$ at least linearly as $m \to \infty$, provided $\hat\alpha_n^{(0)} \in S_{\delta_0}(\hat\alpha_n)$.

(c) If the assumptions in Theorem 4.3 about $\rho(\alpha)$ and $\epsilon_n(\alpha)$ are also satisfied, the sequence $\{\hat\alpha_n^{(m)}\}$ defined by (3.2) converges to $\hat\alpha_n$ at least superlinearly as $m \to \infty$, provided $\hat\alpha_n^{(-1)}, \hat\alpha_n^{(0)} \in S_{\delta_1}(\hat\alpha_n)$.

Proof. The assertions can be proved in the same way as Theorems 4.1, 4.2, and 4.3. ◊


Let $\hat\alpha_n$ be a fixed-point of $\hat\rho_n(\alpha)$ in $S_\delta(\alpha_k^*)$. From (6.1), we obtain

$$\sum_{j=0}^{q-1} G_j(\hat\alpha_n)\,(\hat\alpha_n - \alpha_j^*) = \epsilon_n(\hat\alpha_n). \tag{6.3}$$

Unlike the case of a single sinusoid, this equation does not lead to the conclusion that $\hat\alpha_n \to \alpha_k^*$, even under the assumption that $G_k(\alpha) > g > 0$ for all $\alpha \in S_\delta(\alpha_k^*)$. In order to achieve consistency, a much stronger condition is required that prevents $\hat\alpha_n$ from converging to a false frequency. Obviously, the condition $G_j(\alpha) = 0$ for all $j \ne k$ and all $\alpha \in S_\delta(\alpha_k^*)$ does the job. This condition simply says that those sinusoids whose frequencies differ from $\omega_k = \arccos\alpha_k^*$ must be completely filtered out by $\mathcal{L}_\alpha$ with $\alpha$ in the vicinity of $\alpha_k^*$. In conclusion, $\hat\rho_n(\alpha)$ may still have a unique fixed-point in the vicinity of $\alpha_k^*$, and iterative procedures such as (2.5) and (3.2) may still converge to that fixed-point under condition (6.2) and uniform consistency of $\hat\rho_n(\alpha)$. However, the fixed-point is not necessarily a consistent estimator of $\alpha_k^*$, unless the other frequencies can be completely removed by $\mathcal{L}_\alpha$ with $\alpha$ in the vicinity of $\alpha_k^*$ (see also [6, 10, 12]). This clearly requires that the frequencies be well-separated.
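The loss of consistency can be seen numerically at the population level. The sketch below evaluates $\rho(\alpha)$ for an exponentially weighted filter with $|H(\omega;\alpha)|^2 = 1/(1 - 2\alpha\cos\omega + \alpha^2)$ under white noise and iterates $\alpha \leftarrow \rho(\alpha)$. The function names, the unit sinusoid powers, and the unit noise variance are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def rho_population(alpha, omegas, powers, noise_var=1.0):
    """Population first-order autocorrelation of the filtered process.

    Assumes white noise, for which the filtered noise has power
    noise_var / (1 - alpha**2) and lag-one autocorrelation alpha.
    """
    omegas = np.asarray(omegas, dtype=float)
    powers = np.asarray(powers, dtype=float)
    gain = 1.0 / (1.0 - 2.0 * alpha * np.cos(omegas) + alpha**2)  # |H(w; alpha)|^2
    gamma = powers * gain * (1.0 - alpha**2) / noise_var          # SNR after filtering
    return (np.sum(gamma * np.cos(omegas)) + alpha) / (1.0 + np.sum(gamma))

def fixed_point(omegas, powers, alpha0, iters=500):
    """Fixed-point iteration alpha <- rho(alpha) on the population quantity."""
    alpha = alpha0
    for _ in range(iters):
        alpha = rho_population(alpha, omegas, powers)
    return alpha

w0, w1 = 0.42 * np.pi, 0.6 * np.pi
single = fixed_point([w0], [1.0], alpha0=np.cos(w1))
double = fixed_point([w0, w1], [1.0, 1.0], alpha0=np.cos(w0))

print(np.cos(w0), single)   # one sinusoid: the fixed point is cos(w0)
print(np.cos(w1), double)   # two sinusoids: the fixed point lies between the two cosines
```

With a single sinusoid the fixed point coincides with $\cos\omega_0$; with two sinusoids that this filter does not separate, the fixed point is a gain-weighted compromise between the two cosines, which is exactly what (6.3) with $\epsilon_n = 0$ predicts.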


7  Examples 

This  section  gives  two  examples  of  parametric  filters  that  can  be  applied  in  frequency  estimation 
using  the  CM  method. 


7.1  The  a-Filter 

The exponentially-weighted moving average filter, or the "α-filter", was originally studied by He and Kedem [6] for frequency estimation. It can be defined recursively by

$$y_t(\alpha) = \alpha\,y_{t-1}(\alpha) + y_t,$$

where $-1 < \alpha < 1$. It is easy to see that $h_j(\alpha) = 0$ for $j < 0$, $h_j(\alpha) = \alpha^j$ for $j = 0, 1, \ldots$, and

$$|H(\omega;\alpha)|^2 = \frac{1}{1 - 2\alpha\cos\omega + \alpha^2}.$$
Table 1: Comparison of FPI and Secant Procedures with α-Filter

        FPI [initial value = 0.6π]                  Secant [initial value = (0.3π, 0.6π)]
  m     MEAN ± SDV (×π)         MSE (×π²)           MEAN ± SDV (×π)         MSE (×π²)
  1     0.495854 ± 0.016533     6.02714 × 10^-3     0.495854 ± 0.016533     6.02714 × 10^-3
  2     0.447814 ± 0.017789     1.09008 × 10^-3     0.421869 ± 0.020007     4.03792 × 10^-4
  3     0.430492 ± 0.017368     4.12172 × 10^-4     0.420337 ± 0.017723     3.14232 × 10^-4
  4     0.424347 ± 0.017484     3.24577 × 10^-4     0.420744 ± 0.018096     3.28011 × 10^-4
  5     0.422112 ± 0.017713     3.18207 × 10^-4     0.420741 ± 0.018092     3.27851 × 10^-4
  6     0.421277 ± 0.017881     3.21364 × 10^-4     0.420741 ± 0.018092     3.27850 × 10^-4

Some other interesting properties of the α-filter can be found in [11].

Clearly, $0 < C(\alpha^*) < 1$, and (H1)-(H3) are satisfied on any closed subinterval of $(-1, 1)$. When $\{\epsilon_t\}$ is white, it can be easily shown [6] that the fundamental property (2.3) holds for all $\alpha \in (-1, 1)$. Consequently, this filter can be applied to estimate $\omega_0$ using iterative procedures (2.5) or (3.2). In particular, when $\{\epsilon_t\} \sim \mathrm{IID}(0, \sigma^2)$, Corollary 5.1 and Corollary 5.2 guarantee the convergence of these procedures, and also the strong consistency of their limits for estimating $\alpha^*$. To illustrate the performance of the α-filter, the FPI procedure (2.5) and the secant method (3.2) are applied in the estimation of a single sinusoid in additive Gaussian white noise defined by (1.1) with $\omega_0 = 0.42\pi$, $\phi = 0.1\pi$, and SNR = 3 dB. The initial frequency estimate is taken to be $0.6\pi$ (or $\hat\alpha_n^{(0)} = \cos 0.6\pi$) for the FPI and $(0.3\pi, 0.6\pi)$ (or $\hat\alpha_n^{(-1)} = \cos 0.3\pi$, $\hat\alpha_n^{(0)} = \cos 0.6\pi$) for the secant method. Table 1 presents estimated ensemble averages based on 200 independent realizations of size $n = 100$, where $m$ stands for the number of iterations, and the $m$th estimate of $\omega_0$ is defined by

4"0  :=  arccos/5n(d[1m_1))  =  d^m)  m  =  1,2, -  (7.1) 

As  can  be  seen  in  this  table,  the  FPI  and  the  secant  methods  work  equally  well  in  this 
experiment,  but  the  latter  converges  slightly  faster  at  the  beginning. 
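A single run of this kind of experiment can be sketched as follows. This is not the authors' code: the sample autocorrelation is assumed to be $\hat r_1/\hat r_0$, the secant step is the textbook secant iteration for $\hat\rho_n(\alpha) - \alpha = 0$ (the exact form of (3.2) is not reproduced here), SNR is assumed to mean $(\beta^2/2)/\sigma^2$, and a longer record ($n = 2000$) is used so that one realization is informative.

```python
import numpy as np

def alpha_filter(y, alpha):
    """y_t(alpha) = alpha * y_{t-1}(alpha) + y_t, with zero initial condition."""
    out = np.empty(len(y))
    prev = 0.0
    for t, value in enumerate(y):
        prev = alpha * prev + value
        out[t] = prev
    return out

def rho_hat(y, alpha):
    """Lag-one sample autocorrelation of the filtered data (assumed form of (5.2))."""
    z = alpha_filter(y, alpha)
    return np.sum(z[1:] * z[:-1]) / np.sum(z * z)

def fpi(y, alpha0, iters=30):
    """Fixed-point iteration alpha <- rho_hat(alpha)."""
    alpha = alpha0
    for _ in range(iters):
        alpha = rho_hat(y, alpha)
    return alpha

def secant(y, a_prev, a_curr, iters=30, tol=1e-12):
    """Textbook secant iteration for f(alpha) = rho_hat(alpha) - alpha = 0."""
    f_prev = rho_hat(y, a_prev) - a_prev
    f_curr = rho_hat(y, a_curr) - a_curr
    for _ in range(iters):
        denom = f_curr - f_prev
        if abs(denom) < tol:          # iterates have stopped moving
            break
        a_next = a_curr - f_curr * (a_curr - a_prev) / denom
        a_next = float(np.clip(a_next, -0.99, 0.99))  # keep the filter stable
        a_prev, f_prev = a_curr, f_curr
        a_curr, f_curr = a_next, rho_hat(y, a_next) - a_next
    return a_curr

rng = np.random.default_rng(1)
n, omega0, phi, beta = 2000, 0.42 * np.pi, 0.1 * np.pi, 1.0
sigma = np.sqrt((beta**2 / 2) / 10 ** 0.3)   # assumed SNR definition: (beta^2/2)/sigma^2 = 3 dB
y = beta * np.cos(omega0 * np.arange(n) + phi) + rng.normal(scale=sigma, size=n)

w_fpi = np.arccos(fpi(y, np.cos(0.6 * np.pi))) / np.pi
w_sec = np.arccos(secant(y, np.cos(0.3 * np.pi), np.cos(0.6 * np.pi))) / np.pi
print(w_fpi, w_sec)   # both in units of pi, to be compared with 0.42
```

Both procedures search for the same fixed point of $\hat\rho_n$, so their limits should agree; they differ only in how many filter passes are needed to get there.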
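7.2  The AR(2) Filter

Considerable improvements in estimation accuracy can be achieved by the AR(2) filter defined recursively by

$$y_t(\alpha) + \theta(\alpha)\,\eta\,y_{t-1}(\alpha) + \eta^2\,y_{t-2}(\alpha) = y_t, \tag{7.2}$$

where $0 < \eta < 1$ and

$$\theta(\alpha) := -\frac{1+\eta^2}{\eta}\,\alpha.$$

When $\{\epsilon_t\} \sim \mathrm{IID}(0, \sigma_\epsilon^2)$, it is easy to verify, using a formula given in [7], that the first-order autocorrelation of the filtered noise is

$$-\frac{\eta\,\theta(\alpha)}{1+\eta^2} = \alpha$$

for all $\alpha \in [\underline\alpha, \bar\alpha]$ with

$$\underline\alpha := -\frac{2\eta}{1+\eta^2} \quad\text{and}\quad \bar\alpha := \frac{2\eta}{1+\eta^2}.$$

That is, the fundamental property (2.3) is satisfied by the AR(2) filter (7.2). Its two poles are readily seen to be

$$\zeta_1(\alpha) := \frac{\eta}{2}\Big(-\theta(\alpha) + i\sqrt{4 - \theta^2(\alpha)}\Big), \qquad \zeta_2(\alpha) := \frac{\eta}{2}\Big(-\theta(\alpha) - i\sqrt{4 - \theta^2(\alpha)}\Big),$$

with $|\zeta_1(\alpha)| = |\zeta_2(\alpha)| = \eta$ for all $\alpha \in [\underline\alpha, \bar\alpha]$. Clearly, whenever $\eta < 1$, the poles are contracted within the unit circle in the complex domain so that the filter (7.2) becomes BIBO-stable, i.e., $\sum |h_j(\alpha)| < \infty$. Stability of (7.2) is extremely important for the on-line implementation of the CM method in frequency tracking. This problem will be addressed separately in another paper. Note that the impulse response of the AR(2) filter can be written as

$$h_k(\alpha) = \sum_{j=0}^{k} \zeta_1^{\,j}(\alpha)\,\zeta_2^{\,k-j}(\alpha).$$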
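The pole and impulse-response formulas above can be checked numerically. The following sketch assumes $\theta(\alpha) = -(1+\eta^2)\alpha/\eta$ (the form implied by the fundamental property for white noise) and picks $\eta = 0.95$, $\alpha = \cos 0.42\pi$ purely for illustration.

```python
import numpy as np

eta, alpha = 0.95, np.cos(0.42 * np.pi)
theta = -(1.0 + eta**2) * alpha / eta   # assumed form of theta(alpha)

# Poles of (7.2): roots of z^2 + theta*eta*z + eta^2 = 0.
poles = np.roots([1.0, theta * eta, eta**2])
z1, z2 = poles

# Impulse response from the recursion h_k = -theta*eta*h_{k-1} - eta^2*h_{k-2}.
K = 50
h = np.zeros(K)
h[0] = 1.0
h[1] = -theta * eta
for k in range(2, K):
    h[k] = -theta * eta * h[k - 1] - eta**2 * h[k - 2]

# Closed form h_k = sum_{j=0}^{k} zeta1^j * zeta2^(k-j).
h_closed = np.array(
    [sum(z1**j * z2 ** (k - j) for j in range(k + 1)) for k in range(K)]
)

print(np.abs(poles))                          # both moduli equal eta
print(np.max(np.abs(h - h_closed.real)))      # recursion and closed form agree
print(-eta * theta / (1.0 + eta**2), alpha)   # filtered-noise autocorrelation equals alpha
```

Since each term of the closed form has modulus $\eta^k$, the sum of $k+1$ terms is bounded by $(k+1)\eta^k$, which is summable for $\eta < 1$.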


Clearly, (H2) is satisfied if $\eta < 1$, since $|h_k(\alpha)| \le a_k := (k+1)\eta^k$ for all $k \ge 0$. Moreover, since

$$\zeta_1'(\alpha) = -\frac{1+\eta^2}{2}\left(-1 - \frac{i\,\theta(\alpha)}{\sqrt{4 - \theta^2(\alpha)}}\right), \qquad \zeta_2'(\alpha) = -\frac{1+\eta^2}{2}\left(-1 + \frac{i\,\theta(\alpha)}{\sqrt{4 - \theta^2(\alpha)}}\right),$$

and $|\theta(\alpha)|^2 \le 4$ for all $\alpha \in [\underline\alpha, \bar\alpha]$, with $|\theta(\alpha)|^2 = 4$ if and only if $\alpha = \underline\alpha$ or $\alpha = \bar\alpha$, it follows that $|\zeta_1'(\alpha)|$ and $|\zeta_2'(\alpha)|$ are uniformly bounded by a constant $c$ for all $\alpha \in A := [\underline\alpha + \delta, \bar\alpha - \delta]$, where $\delta > 0$. Note that, with obvious notation,

$$h_k'(\alpha) = \sum_{j=0}^{k} j\,\zeta_1^{\,j-1}\zeta_1'\,\zeta_2^{\,k-j} + \sum_{j=0}^{k} (k-j)\,\zeta_1^{\,j}\,\zeta_2^{\,k-j-1}\zeta_2',$$

and thus

$$|h_k'(\alpha)| \le c\sum_{j=0}^{k} j\,\eta^{k-1} + c\sum_{j=0}^{k} (k-j)\,\eta^{k-1} = b_k := c\,k(k+1)\,\eta^{k-1}$$

for all $k \ge 0$ and $\alpha \in A$. Assumption (H3) is clearly satisfied if $\eta < 1$. It can be shown similarly that (H4) is also valid under the same condition. As a consequence, Corollary 5.1 and Corollary 5.2 apply to the AR(2) filter for estimating $\omega_0$. (Note that $|H(\omega;\alpha)|^2 > 0$ for all $\omega$ and $\alpha$, and hence $C(\alpha^*) < 1$.)

To demonstrate its performance, we apply the AR(2) filter with the FPI procedure to the same data as in Section 7.1, and Table 2 presents the results for $\eta = 0.95$ and $0.98$.

Table 2: Estimates by FPI with AR(2) Filter (Initial Value = 0.55π)

        η = 0.95                                    η = 0.98
  m     MEAN ± SDV (×π)         MSE (×π²)           MEAN ± SDV (×π)         MSE (×π²)
  1     0.498308 ± 0.012725     6.29414 × 10^-3     0.516345 ± 0.013724     9.47067 × 10^-3
  3     0.427856 ± 0.007107     1.12234 × 10^-4     0.454814 ± 0.022079     1.69947 × 10^-3
  6     0.419949 ± 0.000874     7.67307 × 10^-7     0.421551 ± 0.002866     1.06195 × 10^-5
  8     0.419914 ± 0.000867     7.59110 × 10^-7     0.420060 ± 0.000841     7.10807 × 10^-7
 10     0.419913 ± 0.000867     7.59001 × 10^-7     0.419925 ± 0.000795     6.37827 × 10^-7
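A single-realization sketch of the FPI with the AR(2) filter is given below. It is an illustration only: $\theta(\alpha)\eta = -(1+\eta^2)\alpha$ is assumed, the sample autocorrelation is taken to be $\hat r_1/\hat r_0$, SNR is assumed to mean $(\beta^2/2)/\sigma^2$, and the record length $n = 500$ and the seed are arbitrary choices rather than the settings behind the reported tables.

```python
import numpy as np

def ar2_filter(y, alpha, eta):
    """Filter (7.2) with theta(alpha)*eta = -(1 + eta^2)*alpha (assumed form),
    i.e. z_t = y_t + (1 + eta^2)*alpha*z_{t-1} - eta^2*z_{t-2}, zero initial state."""
    z = np.zeros(len(y))
    for t in range(len(y)):
        z1 = z[t - 1] if t >= 1 else 0.0
        z2 = z[t - 2] if t >= 2 else 0.0
        z[t] = y[t] + (1.0 + eta**2) * alpha * z1 - eta**2 * z2
    return z

def rho_hat_ar2(y, alpha, eta):
    """Lag-one sample autocorrelation of the AR(2)-filtered data."""
    z = ar2_filter(y, alpha, eta)
    return np.sum(z[1:] * z[:-1]) / np.sum(z * z)

def fpi_ar2(y, alpha0, eta, iters=40):
    """Fixed-point iteration alpha <- rho_hat(alpha) with the AR(2) filter."""
    alpha = alpha0
    for _ in range(iters):
        alpha = rho_hat_ar2(y, alpha, eta)
    return alpha

rng = np.random.default_rng(2)
n, omega0, phi, beta = 500, 0.42 * np.pi, 0.1 * np.pi, 1.0
sigma = np.sqrt((beta**2 / 2) / 10 ** 0.3)   # assumed SNR definition, 3 dB
y = beta * np.cos(omega0 * np.arange(n) + phi) + rng.normal(scale=sigma, size=n)

w_hat = np.arccos(fpi_ar2(y, np.cos(0.55 * np.pi), eta=0.95)) / np.pi
print(w_hat)   # in units of pi, to be compared with 0.42
```

Because the filter is narrow-band around the current estimate, each pass suppresses more of the noise than the α-filter does, at the price of slower movement when the starting value is far from $\omega_0$.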

As compared with the α-filter, the AR(2) filter provides greater precision for frequency estimation in terms of smaller variance and mean-square error (MSE). This is because the AR(2) filter is bandpass and considerably enhances the frequency components in the vicinity of the angles of its poles, which, when $\alpha = \alpha^*$ and $\eta \approx 1$, are approximately $\pm\omega_0$. The role of $\eta$ can be appreciated by comparing the results for $\eta = 0.95$ and those for $\eta = 0.98$. Clearly, the closer the parameter $\eta$ is to 1, the smaller the variance and the MSE, but the lower the rate of convergence.

Finally, it is worth pointing out that in the extreme case when $\eta = 1$, the CM method using the FPI procedure with the AR(2) filter (7.2) coincides with the method proposed by Quinn and Fernandes [16]. They have shown that in this case, the estimator $\hat\omega_n$ is $n^{3/2}$-consistent, just like the nonlinear least squares estimator. For $\eta < 1$, we can show [15] that $\hat\omega_n$ is $n^{1/2}$-consistent. However, it is important to note that for $\eta = 1$, the FPI requires a much more accurate initial estimate than it does for $\eta < 1$. In fact, when $\eta = 1$, as proved in [16], the accuracy of the initial estimate is required to be of order $n^{-1}$. (A modified method in [16] can reduce this order to $n^{-1/2}$.) Therefore, their method must be used in connection with another method that provides satisfactory initial estimates. On the other hand, the CM method with $\eta < 1$ requires the accuracy of the initial estimate to be $O(1)$. By taking advantage of the flexibility in the choice of $\eta$, the CM method does not require any other method for initialization of the procedure while still achieving better and better estimates by increasing $\eta$, and eventually, as $\eta \to 1$, obtaining $n^{3/2}$-consistency, like the method proposed in [16]. Furthermore, with a flexible $\eta$, the CM method can be applied in the estimation of a time-varying frequency for which the Quinn-Fernandes method may fail due to an excessively narrow bandwidth that can easily "lose" the frequency. Adaptive estimation of time-varying frequencies using the CM method will be addressed elsewhere.


A  Proof  of  Lemma  5.1 


Define

$$Q(t) := \Big[\sum_{j=0}^{t+\tau} h_j(\alpha)\,y_{t-j+\tau}\Big]\Big[\sum_{k=0}^{t} g_k(\alpha)\,y_{t-k}\Big].$$

From (1.1), $Q(t)$ can be written as

$$Q(t) = I_1(t) + I_2(t) + I_3(t),$$

where

$$I_1(t) := \beta^2 \sum_{j=0}^{t+\tau}\sum_{k=0}^{t} h_j g_k \cos[\omega_0(t-j+\tau)+\phi]\,\cos[\omega_0(t-k)+\phi],$$

$$I_2(t) := \beta \sum_{j=0}^{t+\tau}\sum_{k=0}^{t} h_j g_k \big\{\epsilon_{t-j+\tau}\cos[\omega_0(t-k)+\phi] + \epsilon_{t-k}\cos[\omega_0(t-j+\tau)+\phi]\big\},$$

$$I_3(t) := \sum_{j=0}^{t+\tau}\sum_{k=0}^{t} h_j g_k\,\epsilon_{t-j+\tau}\,\epsilon_{t-k}.$$

Here the argument $\alpha$ is omitted in $h_j$ and $g_k$ for notational simplicity.

Using the trigonometric identity

$$\cos\lambda_1\cos\lambda_2 = \tfrac{1}{2}\,[\cos(\lambda_1 - \lambda_2) + \cos(\lambda_1 + \lambda_2)],$$

$I_1(t)$ can be written as $I_1(t) = T_1(t) + T_2(t)$, where, with $\sigma^2 = \beta^2/2$,

$$T_1(t) := \sigma^2 \sum_{j=0}^{t+\tau}\sum_{k=0}^{t} h_j g_k \cos[\omega_0(k-j+\tau)],$$

$$T_2(t) := \sigma^2 \sum_{j=0}^{t+\tau}\sum_{k=0}^{t} h_j g_k \cos[\omega_0(2t-j-k+\tau)+2\phi].$$

As  t  — >•  00,  assumption  (H2)  implies  that 


Ti(t)  ->  a2  hj §k  cos[u0(k  -  j  +  r)] 

j  — 0 fc=0 

f  00  00 

=  EEw(H+T)“° 


l  j=0k=0 


=  a2  »  {ifK;  «)G(u;0;  a)etT“0} 


uniformly  in  o  G  A.  Therefore, 

n  —  T  —  1 

n"1  £  ^(t)  a2  (A.l) 

<= 0 

uniformly in $\alpha \in A$ as $n \to \infty$. Moreover, the following identity holds for any function $u(t,s)$ by interchanging the order of summations:

$$U := \sum_{t=0}^{n-\tau-1}\sum_{j=0}^{t+\tau}\sum_{k=0}^{t} h_j g_k\,u(t-j+\tau,\,t-k) = \sum_{j=\tau}^{n-1}\sum_{k=0}^{j-\tau-1}\sum_{t=0}^{n-j-1} h_j g_k\,u(t,\,t+j-k-\tau) + \sum_{k=0}^{n-\tau-1}\sum_{j=0}^{k+\tau}\sum_{t=0}^{n-k-\tau-1} h_j g_k\,u(t+k-j+\tau,\,t). \tag{A.2}$$
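Before following the derivation, the identity (A.2) can be sanity-checked by brute force on small random inputs. The sketch below is only a numerical check, with arbitrary choices $n = 9$ and $\tau = 2$ and hypothetical function names; it is not part of the proof.

```python
import numpy as np

def lhs(h, g, u, n, tau):
    """Left-hand side of (A.2): sum over t, then j <= t + tau and k <= t."""
    total = 0.0
    for t in range(n - tau):
        for j in range(t + tau + 1):
            for k in range(t + 1):
                total += h[j] * g[k] * u[t - j + tau, t - k]
    return total

def rhs(h, g, u, n, tau):
    """Right-hand side of (A.2): the two rearranged triple sums."""
    total = 0.0
    for j in range(tau, n):
        for k in range(j - tau):              # k = 0, ..., j - tau - 1
            for t in range(n - j):            # t = 0, ..., n - j - 1
                total += h[j] * g[k] * u[t, t + j - k - tau]
    for k in range(n - tau):
        for j in range(k + tau + 1):          # j = 0, ..., k + tau
            for t in range(n - k - tau):      # t = 0, ..., n - k - tau - 1
                total += h[j] * g[k] * u[t + k - j + tau, t]
    return total

rng = np.random.default_rng(3)
n, tau = 9, 2
h = rng.normal(size=n)        # h_j for j = 0, ..., n - 1
g = rng.normal(size=n)        # g_k; only k <= n - tau - 1 is used
u = rng.normal(size=(n, n))   # arbitrary function u(t, s) tabulated on a grid

print(lhs(h, g, u, n, tau), rhs(h, g, u, n, tau))
```

The first sum on the right collects the terms with $k < j - \tau$ and the second those with $k \ge j - \tau$, so every index triple on the left is counted exactly once.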
In fact, by first interchanging the summations over $t$ and $j$ in the first expression and by substituting $t + j - \tau$ for $t$ afterwards, we obtain

$$\Big(\sum_{j=0}^{\tau-1}\sum_{t=\tau-j}^{n-j-1} + \sum_{j=\tau}^{n-1}\sum_{t=0}^{n-j-1}\Big)\sum_{k=0}^{t+j-\tau} h_j g_k\,u(t,\,t+j-k-\tau).$$

Then, interchanging the summations over $t$ and $k$ again gives

$$\Big(\sum_{j=0}^{\tau-1}\sum_{k=0}^{n-\tau-1}\sum_{t=k-j+\tau}^{n-j-1} + \sum_{j=\tau}^{n-1}\sum_{k=0}^{j-\tau-1}\sum_{t=0}^{n-j-1} + \sum_{j=\tau}^{n-1}\sum_{k=j-\tau}^{n-\tau-1}\sum_{t=k-j+\tau}^{n-j-1}\Big)\,h_j g_k\,u(t,\,t+j-k-\tau).$$

By finally interchanging the summations over $j$ and $k$ in the first and the last terms and combining them afterwards, we obtain

$$\Big(\sum_{k=0}^{n-\tau-1}\sum_{j=0}^{k+\tau}\sum_{t=k-j+\tau}^{n-j-1} + \sum_{j=\tau}^{n-1}\sum_{k=0}^{j-\tau-1}\sum_{t=0}^{n-j-1}\Big)\,h_j g_k\,u(t,\,t+j-k-\tau).$$

Identity (A.2) follows immediately by substituting $t + k - j + \tau$ for $t$ in the first term. Now applying (A.2) with $u(t,s) = \sigma^2\cos[\omega_0(t+s) + 2\phi]$, we obtain

$$\sum_{t=0}^{n-\tau-1} T_2(t) = \sigma^2\sum_{j=\tau}^{n-1}\sum_{k=0}^{j-\tau-1} h_j g_k \sum_{t=0}^{n-j-1}\cos[\omega_0(2t+j-k-\tau)+2\phi] + \sigma^2\sum_{k=0}^{n-\tau-1}\sum_{j=0}^{k+\tau} h_j g_k \sum_{t=0}^{n-k-\tau-1}\cos[\omega_0(2t+k-j+\tau)+2\phi] := U_1 + U_2.$$

Let $a_j^h$ and $a_j^g$ be the constants in (H2) associated with $h_j$ and $g_j$, respectively. Then, for each $j$ and $k$, and for any $\alpha \in A$,

$$\Big|\,n^{-1} h_j g_k \sum_{t=0}^{n-j-1}\cos[\omega_0(2t+j-k-\tau)+2\phi]\,\Big| \le a_j^h a_k^g,$$

and as $n \to \infty$,

$$n^{-1}\sum_{t=0}^{n-j-1}\cos[\omega_0(2t+j-k-\tau)+2\phi] \to 0.$$

According to assumption (H2) and the dominated convergence theorem, $n^{-1}U_1 \to 0$ uniformly in $\alpha \in A$. The same assertion is also true for $U_2$ by a similar argument. Therefore, $n^{-1}\sum_{t=0}^{n-\tau-1} T_2(t) \to 0$ uniformly in $\alpha \in A$. Combining this with (A.1) yields

n-1  ]T  h{t)  a-A  a2  K  {jT(w0;  a)<7(w0;  a)e;™°}  (A.3) 

t=0 
uniformly in $\alpha \in A$ as $n \to \infty$.

Write $I_2(t) = T_3(t) + T_4(t)$, where

$$T_3(t) := \beta\sum_{j=0}^{t+\tau}\sum_{k=0}^{t} h_j g_k\,\epsilon_{t-j+\tau}\cos[\omega_0(t-k)+\phi],$$

$$T_4(t) := \beta\sum_{j=0}^{t+\tau}\sum_{k=0}^{t} h_j g_k\,\epsilon_{t-k}\cos[\omega_0(t-j+\tau)+\phi].$$

Applying (A.2) with $u(t,s) = \beta\,\epsilon_t\cos(\omega_0 s + \phi)$ gives

$$\sum_{t=0}^{n-\tau-1} T_3(t) = \beta\sum_{j=\tau}^{n-1}\sum_{k=0}^{j-\tau-1} h_j g_k \sum_{t=0}^{n-j-1}\epsilon_t\cos[\omega_0(t+j-k-\tau)+\phi] + \beta\sum_{k=0}^{n-\tau-1}\sum_{j=0}^{k+\tau} h_j g_k \sum_{t=0}^{n-k-\tau-1}\epsilon_{t+k-j+\tau}\cos(\omega_0 t+\phi) := U_3 + U_4. \tag{A.4}$$

Splitting $U_3$ into two terms, we get

$$U_3 = \beta\Big(\sum_{j=\tau}^{N-1} + \sum_{j=N}^{n-1}\Big)\sum_{k=0}^{j-\tau-1} h_j g_k \sum_{t=0}^{n-j-1}\epsilon_t\cos[\omega_0(t+j-k-\tau)+\phi] := U_3^{(N)} + \big(U_3 - U_3^{(N)}\big).$$

For  any  a  g  A, 


N-lj-T-l 


j—T  k= 0 


n-j-1 

E  Qcosu0t 

t=0 


+ 


n-j-1 

E  Q  sin 
t-o 


with probability one. For each fixed $j$, it can be shown [1] that


n  —j  --  1 

^  ^  €f  COS  LJq t 
t= 0 


<  Oa  s.  and 

as  n  —+  oo.  Therefore,  for  each  fixed  A, 


n—j  —  1 

T; 

t=0 


<  (V '»■ log nj 


n  lU{3N)  <  00  5.  (  Jlogn/n  )  “-4'  0 


uniformly  in  a  g  A.  This  implies  that  uniformly  in  a  g  A, 


lim  lim  n  1U3N)  =  0  a. s. 

N—>-oo  n-* oo 


(A.5) 
Moreover, it is readily seen that for any $\alpha \in A$,

\u,-v™\  < 

j=N  k- 0  t- o  ' 

n— 1 j—T—1  n— 1 

<  aJaiJ2\e*\ 

j=N  k= 0  t= 0 

n  —  1  oo 

<  dgy: |£,IE»? 

(=0  j=N 

with  probability  one,  where  f?  :=  J2T=oal-  Define  Cw(0  :=  fiG\et\ Y1T=n  aj  ?  then  (Gv(f)}  is 
strictly  stationary  for  each  fixed  iV,  and 

n—  1 


t= 0 


with  probability  one  for  any  a  £  A.  According  to  the  strong  ergodic  theorem  (see,  for  example, 
[8]),  for  each  fixed  jV,  we  have 


n  — 1 


lim  n  1  V  Cw(f )  =  Cn  a.s. 

n. — *-oo  *  -* 


t= 0 


for  some  random  variable  Cn-,  and 


£(Cw)  =  JB{Cw(0)}  =  ^G'E|e0|Eaf- 

j=N 

Since  (n  >  0  with  probability  one,  and 

CO  CO  OO  0O 

E  E(Cn)  =  PGE\e0\  E  E  <  =  E0‘  +  iK  < 

W=0  N=0j  =  N  j= 0 

using  Chebyshev’s  inequality,  we  obtain,  for  any  /i  >  0, 

f  OO  'j  OO  i  oo 

p {  U  (c*  > aO [  <  E  p{Cw>/i}<-  E  £(Cn) - o 

f1  N=N' 


^N=N' 


N=N' 


as  N'  —>  oo.  Therefore,  P{  i.o.  Cn  >  /<}  =  0  for  any  //  >  0,  which  implies  that  Cn  0  as 
N  oo.  Consequently, 

lim  Umsupn-1^!  —  (j\N>\  =  0 


N—*oo 


a.s. 


uniformly  in  a  £  A.  Combining  with  (A.5)  gives  n  1U3  a—>  0  as  n  — >  oo  uniformly  in  a  £  A. 
By  switching  k  and  j  in  the  second  term  of  (A. 4),  U4  can  be  written  as 


fJV-l 


n-T-l\  j 

V'  1  \ 


+T 


u4  =  P  E  +  Y,  )Ylh^j  E  et+j-k+Tcos(uj0t  +  (l>) 

\j= 0  j=N  )  k-o 

:=  U^  +  (U4-UiN)) 


n—j—T— 1 

£ 

t-0 
As with $U_3^{(N)}$, it can be shown in a similar way that


lim  lim  n  1U4N^  =  0  a.s. 


N— *-co  n— +00 


uniformly  in  a  £  A.  Furthermore,  for  all  a  £  A, 


n-r-l j+T  n-j-T- 1 


\U4-UiN}\  <  ^  E°X  E  l^+i-*+r 


j-N  k~Q  t-0 

n-r-lj+T  n— k— 1 

=  0  E  E°*°'  E  M 

j  =N  fc=0  t  =j  —  k  +  T 

n  —  T—l j+r  n  —  1 

<?E  E»WEW 

j=AT  k= 0  t=0 

n  —  1  oo 

<  ^Ele'lE“’ 

4  =  0  j=N 

with  probability  one,  where  H  :=  ak-  Like  ^3  ~  U^N\  we  have 

lim  limsup  n~1\U4  -  ll{N^\  =  0  a.s. 

N—+00  n-+oo 

Consequently,  n~1UA  a—>'  0  as  n  -*  oo  uniformly  in  a  G  A.  Combining  all  these  results  gives 
n~lHt= 0  uniformly  in  a  £  A.  The  same  result  can  be  established  for  T4(f)  upon 
noting  the  resemblance  between  T3(t)  and  T4(t).  Therefore,  we  have  proved 


E  «<)"■« 


(A.6) 


uniformly  in  a  £  A. 

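Finally, let us consider $I_3(t)$. Using (A.2) with $u(t,s) = \epsilon_t\epsilon_s$, we get

$$\sum_{t=0}^{n-\tau-1} I_3(t) = \sum_{j=\tau}^{n-1}\sum_{k=0}^{j-\tau-1} h_j g_k \sum_{t=0}^{n-j-1}\epsilon_t\,\epsilon_{t+j-k-\tau} + \sum_{k=0}^{n-\tau-1}\sum_{j=0}^{k+\tau} h_j g_k \sum_{t=0}^{n-k-\tau-1}\epsilon_t\,\epsilon_{t+k-j+\tau} := U_5 + U_6.$$

Splitting $U_5$ into two terms gives

$$U_5 = \Big(\sum_{j=\tau}^{N-1} + \sum_{j=N}^{n-1}\Big)\sum_{k=0}^{j-\tau-1} h_j g_k \sum_{t=0}^{n-j-1}\epsilon_t\,\epsilon_{t+j-k-\tau} := U_5^{(N)} + \big(U_5 - U_5^{(N)}\big).$$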
By strong ergodicity (see, for example, [5] and [3]),

$$\lim_{n\to\infty} n^{-1}\sum_{t=0}^{n-j-1}\epsilon_t\,\epsilon_{t+j-k-\tau} = r^{\epsilon}_{k-j+\tau} \quad\text{a.s.}$$

for each fixed $j$ and $k$, where

$$r^{\epsilon}_{\tau} := \int_{-\pi}^{\pi} e^{i\tau\omega}\,dF(\omega).$$

On the other hand,


JV— 1  j  —  T—1 

N-l j-T—1 

n-j-l 

n~lUiN)  ~E  E  hjgkrl_j+T 

sEE  ai< 

n_1  E  ^t+j-k-T  -  rk-j  +  T 

1! 

»■ 

II 

o 

O 

II 

II 

t= 0 

Therefore,  with  probability  one, 


i™on"1C/sW  =  EE  hj9krl_j+T 


OO  j  —  T—1 


N — ►  oo  n— *oo 


j—T  fc  =  0 
oo  oo 


=  E  E  hj9krl. 


k— 0 7  nfc+T+l 


uniformly  in  o  €  A  Moreover,  for  all  a  €  A, 


j=W  fc=:0  <=o 

n— 1 j—T— 1  n-1 

<  E  E  °j' E  |c»€(+i-*-r| 

j=jv  r-=o  t=o 

n-l  co  j  —  T—1 

<  eee*;  afle<e<+i-fc-r| 

t=Oj=N  kzz  0 

with  probability  one.  Note  that  in  the  last  inequality  the  infinite  sum 

OO  j—T—1 

Zn(1)  -=  ajak\^t^t+j-k-r\ 

j —N  k= 0 

converges  with  probability  one,  since  ZN(t)  >  0,  and  by  the  monotone  convergence  theorem, 

CO  j  —  T—1 

E{zN(t)}  =  E  E  ^+i-*-ri 

j —N  k—0 

oo  j  —  r  —  1  oo 

-  CTe2  E  I]  aM  ^  E  aj  <  °°> 

j  =jV  A: =0  j—N 

where  of  :=  7e(0)  and  the  second  expression  is  a  result  of  Cauchy- Schwarz  inequality.  Note 
also  that  {Zjv(f)}  is  strictly  stationary  for  each  fixed  N ,  and  by  the  strong  ergodic  theorem, 

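$$n^{-1}\sum_{t=0}^{n-1} Z_N(t) \xrightarrow{\text{a.s.}} Z_N$$

as $n \to \infty$ for some random variable $Z_N$ with $E(Z_N) = E\{Z_N(0)\} \le \sigma_\epsilon^2\,G\sum_{j=N}^{\infty} a_j^h$. Since $\sum_{N=0}^{\infty} E(Z_N) \le \sigma_\epsilon^2\,G\sum_{j=0}^{\infty}(j+1)\,a_j^h < \infty$, we have $Z_N \to 0$ a.s. as $N \to \infty$. Therefore, uniformly in $\alpha \in A$,

$$\lim_{N\to\infty}\limsup_{n\to\infty}\, n^{-1}\big|U_5 - U_5^{(N)}\big| = 0 \quad\text{a.s.}$$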

Combining  these  results  we  get 

oo  oo 

rCllh  -  ]T  ^  hjgkrl_j+T 

k=0  j=k+ r-}-l 

uniformly  in  a  6  i  as  n  -4  oo.  In  a  similar  way,  we  can  also  show  that 


n 


OO  k  +  T 

■‘K.-EEM.-J-i,, 

*=o j =0 


uniformly  iti  a  G  A  as  n  ->  00.  Consequently,  we  have  proved  that  uniformly  in  a  £  A, 

n-r-l  oo  00 


n 


_1  S  J3(0  ^  2  2 

k=0  j=0 


(A.7) 


t- 0 


=  r  H(oj;a)G(io;a)eiTWdF(Lu) 

J  —  IT 


as $n \to \infty$. Note that the last quantity is real because of the symmetry of $G(\omega;\alpha)$, $H(\omega;\alpha)$, and $F(\omega)$ as functions of $\omega$. Now (5.6) is proved upon collecting (A.3), (A.6), and (A.7).


B  Proof  of  Lemma  5.2 

First  of  all,  since  yt(a)  0  for  t  <  0,  f0(a)  and  ^(a)  can  be  rewritten  as 

r0(a)  =  n-lJ2tf(a) 

t= 0 
n  —  2 

n(«)  =  n"1  Y,yt+i(cx)yt(a). 

t= 0 

Define 

n  — 1 

r0(a)  :=  n_1  ^]yt2(a).  (B.l) 

*=0 

Then  it  can  be  shown  that  under  the  conditions  in  Lemma  5.1, 

f0(a)  -  r0(a)  “-4  0 
uniformly in $\alpha \in A$ as $n \to \infty$. In fact, it is easy to see that


|f0(o)  -  f0(a)|  =  n  1jfc_ j(a) 

n  —  1 

<  n~lH2  +  2Hn~l'Y^ahj\cn_j_i\ 

3=  0 

n— 1 n— 1 

"t”  ^  ^  ^  ^  ^  \^n—j  —  \^n  —  k  —  \  |  • 

j=0  fc— 0 

Clearly,  the  first  term  tends  to  zero  as  n  — »  oo.  Since  the  variance  of  the  second  term  £n  := 
2Hn~1  Yl]~o  aj\en-j- 1|  is  bounded  by  4HAa2n~2  and  hence  Var{£„}  <  oo,  we  obtain  £n  -4’  0. 
To  show  that  the  last  term,  denoted  by  Qn,  also  vanishes  with  probability  one,  let  us  rewrite  6n 
as 


6„  —  ri 


n  — 1  /j-1  j  \ 

1 (S+Xj  aJak\€" 

j=  0  \k  = 0  fc=0/ 


-j  —  Un  —  fc  —  1 1 


JV-1  /j-1  j  \ 

<  n_1  X]  S  +  X!  °j  °tl€»-i-l€n-fc-l| 

j=0  Vfc=0  ife=0/ 

j=N  \k= 0  k=0/  t—0 

:=  e)+(«n-e)). 


By  strong  ergodicity, 


n  —  1 


n  —  2 


71  ^n-j-l^n-fc-1 


n 


_1  X  n  1  X)  U-jft- 


t=o 


*  =  0 


a. 5.  f 

— ►  r 


k-j  ~  rk-j  ~  0 


for  each  fixed  j  and  k.  Therefore,  0^  vanishes  as  n  — *  oc  for  each  fixed  N,  and  thus 


lim  lim  6[N)  =  0  a.s. 

N—+oo  n— ►  oo 

Moreover,  applying  the  same  technique  used  to  derive  the  limit  of  n-1(f75  —  U^)  in  Lemma  5.1 
gives 

lim  lim  (9n  -  6(nN))  =  0  a.s. 

Consequently,  9n  -4  0  as  n  — >■  oo.  Combining  all  these  results  proves  the  assertion  that  f0(a) 
and  f0(a)  are  uniformly  equivalent  as  n  tends  to  infinity.  Using  a  similar  argument,  it  can  be 
shown  that  the  uniform  equivalence  also  holds  for  r'0(a)  and  /^(a)  :=  2 n~1  J2t=o  y't(o)yt{a). 
Because  of  the  uniform  equivalence,  it  suffices  to  show  the  consistency  of  f0(a),  r'Q(a ),  rt(a),
and  f'^a).  Note  that  from  (5.1),  (5.4),  and  (B.l),  the  following  identities  hold: 

f0(a)  =  n-1^  \J2hj(a)yt-i )  [Hhi(a)yt-i 

t= 0  \i= 0  /  0 

n  — 2  /  t+1  \  /  < 

f'i(«)  =  n-1  S^j(«)»<-J+1  )  (  ^2hj(a)yt-i 

t= 0  \j= o 


n  — 1  /  t 


vi=° 

t 


f'0(a)  =  2n  1  X]  I  1  EE  M  «)?/»- j 


(=0  y=o 

n-2  / 1+1 


W=° 

i 


fi(«)  =  "‘E  E^h+i  EW' 


t—j 


t—0  \j=o 


ij=0 


n-2  ( (  +  1 

+  n~l  EE  ( 1  I  Ylh'j(a)yt-j  ]  • 

1=0  \j= 0  j  \j=  0 


Applying  Lemma  5.1  to  these  quantities  proves  the  uniform  consistency. 